| The Linux kernel code has been rewritten to use Unicode to map |
| characters to fonts. By downloading a single Unicode-to-font table, |
| both the eight-bit character sets and UTF-8 mode are changed to use |
| the font as indicated. |
| |
| This changes the semantics of the eight-bit character tables subtly. |
| The four character tables are now: |
| |
| Map symbol Map name Escape code (G0) |
| |
| LAT1_MAP Latin-1 (ISO 8859-1) ESC ( B |
| GRAF_MAP DEC VT100 pseudographics ESC ( 0 |
| IBMPC_MAP IBM code page 437 ESC ( U |
| USER_MAP User defined ESC ( K |
| |
| In particular, ESC ( U is no longer "straight to font", since the font |
| might be completely different than the IBM character set. This |
| permits for example the use of block graphics even with a Latin-1 font |
| loaded. |
| |
| In accordance with the Unicode standard/ISO 10646 the range U+F000 to |
| U+F8FF has been reserved for OS-wide allocation (the Unicode Standard |
| refers to this as a "Corporate Zone", since this is inaccurate for |
| Linux we call it the "Linux Zone"). U+F000 was picked as the starting |
| point since it lets the direct-mapping area start on a large power of |
| two (in case 1024- or 2048-character fonts ever become necessary). |
| This leaves U+E000 to U+EFFF as End User Zone. |
| |
| The Unicodes in the range U+F000 to U+F1FF have been hard-coded to map |
| directly to the loaded font, bypassing the translation table. The |
| user-defined map now defaults to U+F000 to U+F1FF, emulating the |
| previous behaviour. This range may expand in the future should it be |
| warranted. |
| |
| Actual characters assigned in the Linux Zone |
| -------------------------------------------- |
| |
| In addition, the following characters not present in Unicode 1.1.4 (at |
| least, I have not found them!) have been defined; these are used by |
| the DEC VT graphics map: |
| |
| U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1 |
| U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3 |
| U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7 |
| U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9 |
| |
| The DEC VT220 uses a 6x10 character matrix, and these characters form |
| a smooth progression in the DEC VT graphics character set. I have |
| omitted the scan 5 line, since it is also used as a block-graphics |
| character, and hence has been coded as U+2500 FORMS LIGHT HORIZONTAL. |
| However, I left U+F802 blank should the need arise. |
| |
| Klingon language support |
| ------------------------ |
| |
| Unfortunately, Unicode/ISO 10646 does not allocate code points for the |
| language Klingon, probably fearing the potential code point explosion |
| if many fictional languages were submitted for inclusion. There are |
| also political reasons (the Japanese, for example, are not too happy |
| about the whole 16-bit concept to begin with.) However, with Linux |
| being a hacker-driven OS it seems this is a brilliant linguistic hack |
| worth supporting. Hence I have chosen to add it to the list in the |
| Linux Zone. |
| |
| Several glyph forms for the Klingon alphabet have been proposed. |
| However, since the set of symbols appear to be consistent throughout, |
| with only the actual shapes being different, in keeping with standard |
| Unicode practice these differences are considered font variants. |
| |
| Klingon has an alphabet of 26 characters, a positional numeric writing |
| system with 10 digits, and is written left-to-right, top-to-bottom. |
| Punctuation appears to be only used in Latin transliteration; it |
| appears customary to write each sentence on its own line, and |
| centered. Space has been reserved for punctuation should it prove |
| necessary. |
| |
| This encoding has been endorsed by the Klingon Language Institute. |
| For more information, contact them at: |
| |
| http://www.kli.org/ |
| |
| Since the characters in the beginning of the Linux CZ have been more |
| of the dingbats/symbols/forms type and this is a language, I have |
| located it at the end, on a 16-cell boundary in keeping with standard |
| Unicode practice. |
| |
| U+F8D0 KLINGON LETTER A |
| U+F8D1 KLINGON LETTER B |
| U+F8D2 KLINGON LETTER CH |
| U+F8D3 KLINGON LETTER D |
| U+F8D4 KLINGON LETTER E |
| U+F8D5 KLINGON LETTER GH |
| U+F8D6 KLINGON LETTER H |
| U+F8D7 KLINGON LETTER I |
| U+F8D8 KLINGON LETTER J |
| U+F8D9 KLINGON LETTER L |
| U+F8DA KLINGON LETTER M |
| U+F8DB KLINGON LETTER N |
| U+F8DC KLINGON LETTER NG |
| U+F8DD KLINGON LETTER O |
| U+F8DE KLINGON LETTER P |
| U+F8DF KLINGON LETTER Q |
| - Written <q> in standard Okrand Latin transliteration |
| U+F8E0 KLINGON LETTER QH |
| - Written <Q> in standard Okrand Latin transliteration |
| U+F8E1 KLINGON LETTER R |
| U+F8E2 KLINGON LETTER S |
| U+F8E3 KLINGON LETTER T |
| U+F8E4 KLINGON LETTER TLH |
| U+F8E5 KLINGON LETTER U |
| U+F8E6 KLINGON LETTER V |
| U+F8E7 KLINGON LETTER W |
| U+F8E8 KLINGON LETTER Y |
| U+F8E9 KLINGON LETTER GLOTTAL STOP |
| |
| U+F8F0 KLINGON DIGIT ZERO |
| U+F8F1 KLINGON DIGIT ONE |
| U+F8F2 KLINGON DIGIT TWO |
| U+F8F3 KLINGON DIGIT THREE |
| U+F8F4 KLINGON DIGIT FOUR |
| U+F8F5 KLINGON DIGIT FIVE |
| U+F8F6 KLINGON DIGIT SIX |
| U+F8F7 KLINGON DIGIT SEVEN |
| U+F8F8 KLINGON DIGIT EIGHT |
| U+F8F9 KLINGON DIGIT NINE |
| |
| Other Fictional and Artificial Scripts |
| -------------------------------------- |
| |
| Since the assignment of the Klingon Linux Unicode block, a registry of |
| fictional and artificial scripts has been established by John Cowan, |
| <cowan@ccil.org>. The ConScript Unicode Registry is accessible at |
| http://locke.ccil.org/~cowan/csur/; the ranges used fall at the bottom |
| of the End User Zone and can hence not be normatively assigned, but it |
| is recommended that people who wish to encode fictional scripts use |
| these codes, in the interest of interoperability. For Klingon, CSUR |
| has adopted the Linux encoding. |
| |
| H. Peter Anvin <hpa@zytor.com> |