The Japanese Input Method (JIM) provides Japanese input. The features include the following:
The Japanese code sets consist of the following character groups:
Katakana and Hiragana consist of approximately 50 characters each and form the set of phonetic characters referred to as Kana. All of the sounds in the Japanese language can be represented in Kana.
Kanji is a set of ideographs. A simple concept can be represented by a single Kanji character, while more complicated meanings can be formed with strings of Kanji characters. Several thousand Kanji characters exist.
The Japanese also use the Roman alphabet. Called Romaji, the Roman alphabet consists of 26 characters. It is used mostly in technical and professional environments to represent technical vocabulary that does not exist in Japanese. A typical sentence is usually a mixture of Katakana, Hiragana, Kanji, Romaji, numbers, and other characters.
The Japanese Industrial Standard (JIS) specifies about 7000 Kanji characters processed by computer systems. Japanese products made by this manufacturer support all of the standard characters, as well as others. Input of the characters is accomplished through Romaji-to-Kana conversion (RKC) and Kana-to-Kanji conversion (KKC).
The following special keys appear on the 106-key Japanese keyboard to allow for these conversions:
Special Japanese Keys
Key Function | Key Name | Description of Function |
KKC Non-conversion key | muhenkan | Leaves Kana characters as is. |
KKC Conversion key | henkan | Converts Kana to Kanji. |
KKC All Candidates key | zenkouho | Shows all possible Kanji candidates. |
RKC Romaji Mode key | romaji | Toggles RKC on and off. |
Hiragana Shift key | hiragana | Enters the Hiragana shift state. |
Katakana Shift key | katakana | Enters the Katakana shift state. |
Romaji Shift key | eisu | Enters the Romaji shift state. |
The Japanese Input Method's (JIM) KKC technology is based on the fact that every Kanji character or set of Kanji characters has a phonetic sound or sounds that can be expressed by Katakana or Hiragana characters.
It is much easier to input Hiragana or Katakana characters than Kanji characters. The JIM analyzes the phonetic values of the Hiragana and Katakana characters to determine the best Kanji-character equivalent. Such phonetic analysis depends on the dictionary and tables provided to the JIM.
The JIM has the following modes that can be used to control the input processing:
Keyboard mapping: Allows invocation of alphanumeric, Katakana, or Hiragana modes.
Character size: Inputs in Zenkaku (full-width) or Hankaku (half-width) mode.
RKC: Inputs Kana directly or invokes the pre-edit composing mode to input Kana with a combination of alphabetic characters. The pre-editing facility allows processing of characters before they are committed to the application.
When the keyboard-mapping mode is alphanumeric and the character size mode is Hankaku, the JIM maps keys to Romaji characters. This mode combination is known as the "English" mode. Pre-editing is not needed in English mode and cannot be invoked regardless of the RKC mode setting. The other mode combinations may initiate pre-editing and characters generated in these modes are not ASCII.
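The mode combination rule above can be sketched in a few lines. This is only an illustration: the enum and function names are hypothetical and are not part of any JIM interface.

```python
from enum import Enum


class KeyboardMapping(Enum):
    ALPHANUMERIC = "alphanumeric"
    KATAKANA = "katakana"
    HIRAGANA = "hiragana"


class CharSize(Enum):
    HANKAKU = "half-width"
    ZENKAKU = "full-width"


def is_english_mode(mapping, size):
    """English mode is alphanumeric mapping combined with Hankaku size."""
    return mapping is KeyboardMapping.ALPHANUMERIC and size is CharSize.HANKAKU


def may_preedit(mapping, size, rkc_on):
    """Pre-editing is never invoked in English mode, whatever the RKC setting.

    For every other combination the text only says pre-editing *may* start,
    so this sketch simply reports that it is possible.
    """
    if is_english_mode(mapping, size):
        return False  # rkc_on is deliberately ignored here
    return True
```

For example, `may_preedit(KeyboardMapping.ALPHANUMERIC, CharSize.HANKAKU, rkc_on=True)` is false, while alphanumeric mapping with Zenkaku size is not English mode and may pre-edit.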
The following keys are used to perform Kana-to-Kanji conversion by the JIM.
Keysym | Keyboard Mapping |
Katakana | Katakana shift |
Eisu_toggle | Alphanumeric shift |
Hiragana | Hiragana shift |
Keysym | Character Size |
Zenkaku_Hankaku | Full-width or Half-width toggle |
Hankaku | Half-width |
Zenkaku | Full-width |
Keysym | RKC on/off |
Alt-Hiragana | Enables/Disables Romaji-to-Kana conversion |
Romaji | *Same effect as Alt-Hiragana |
* Keysyms unique to the manufacturer
The following keys are also used when the JIM is pre-editing a Kanji string.
Keysym | Kanji pre-edit |
Muhenkan | Non-conversion - commit Kana |
Henkan | Conversion - get next candidate |
Kanji | Same as Henkan |
BunsetsuYomi | *Moves back a phrase |
MaeKouho | *Moves to previous candidate |
LeftDouble | *Moves cursor two characters left |
RightDouble | *Moves cursor two characters right |
ErInput | *Discards the current pre-edit string |
Keysym | Auxiliary pre-edit |
Alt-Henkan | All candidates |
Touroku | Run-time registration |
ZenKouho | *All candidates (the same effect) |
KanjiBangou | *Kanji Number Input |
HenkanMenu | *Changes conversion mode |
* Keysyms unique to the manufacturer
The following keyboard mapping states are possible: Alphanumeric (Romaji), Katakana, and Hiragana. Each state is invoked by a keysym that acts as a locking shift key. The keysyms are Katakana, Eisu_toggle, and Hiragana.
When one of these keysyms is pressed, keyboard mapping enters the state associated with the key. This state is maintained until one of the other keysyms is pressed. The initial shift state is Eisu_toggle, which can be changed by customization.
When you invoke the Hiragana or Katakana state, each key is mapped to a phonetic character within the respective character set. For example, if you press q, a Hiragana character pronounced "ta" is produced during Hiragana shift state, a Katakana character pronounced "ta" is produced during Katakana shift state, or a Romaji q is produced during Eisu_toggle shift state. On Japanese IBM keyboards, the tops of keys show all three symbols.
Also, when keyboard mapping is in Hiragana state, the input method is automatically put into a composing pre-editing mode where each Hiragana character can be converted into a Kanji character. See Kanji Pre-edit for more information.
Some keys have two Hiragana or Katakana characters assigned. For example, the 7 key has large and small Hiragana characters both having the pronunciation "ya". These characters are not uppercase and lowercase equivalents of each other because Kanji, Hiragana, and Katakana do not have uppercase and lowercase. The small characters are used to express special phonetic sounds. These characters can be distinguished by using the shift key.
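The locking shift behavior can be modeled as a small state machine. The sketch below is illustrative only: the class name and the two-key table are hypothetical, and the table covers just the q and 7 keys mentioned in the text.

```python
LOCKING_KEYSYMS = ("Katakana", "Eisu_toggle", "Hiragana")

# (key, shifted) -> output per shift state. Only the keys discussed in the
# text are listed; shifted alphanumeric output is left out of this sketch.
KEY_TABLE = {
    ("q", False): {"Eisu_toggle": "q", "Hiragana": "た", "Katakana": "タ"},
    ("7", False): {"Eisu_toggle": "7", "Hiragana": "や", "Katakana": "ヤ"},
    ("7", True): {"Hiragana": "ゃ", "Katakana": "ャ"},  # small "ya"
}


class ShiftState:
    """Tracks the locking keyboard-mapping state (hypothetical helper)."""

    def __init__(self, initial="Eisu_toggle"):
        self.state = initial

    def press(self, key, shift=False):
        # A locking keysym changes the state and produces no character.
        if key in LOCKING_KEYSYMS:
            self.state = key
            return None
        # Any other key is looked up in the table for the current state.
        return KEY_TABLE[(key, shift)].get(self.state)
```

Pressing `Hiragana` once keeps the Hiragana state active for every following key until `Katakana` or `Eisu_toggle` is pressed.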
A subset of the Japanese character set is represented in both full-width and half-width. Kanji ideographic characters are usually full-width. The phonetic and ASCII characters have both full-width and half-width representations. The user controls character size by pressing the Zenkaku_Hankaku keysym, which toggles between full-width and half-width.
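As a rough illustration of the two widths, the sketch below converts ASCII text between half-width and full-width forms. It assumes Unicode strings, where the full-width forms of the printable ASCII characters sit at a fixed offset of 0xFEE0 and the full-width space is U+3000; the JIM itself works on its own code set, and the function names are hypothetical. Kana width conversion is not covered.

```python
FULLWIDTH_OFFSET = 0xFEE0      # U+0021..U+007E -> U+FF01..U+FF5E
IDEOGRAPHIC_SPACE = "\u3000"   # full-width space


def to_zenkaku(text):
    """Convert half-width ASCII characters to their full-width forms."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == " ":
            out.append(IDEOGRAPHIC_SPACE)
        elif 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out)


def to_hankaku(text):
    """Convert full-width ASCII forms back to half-width ASCII."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == IDEOGRAPHIC_SPACE:
            out.append(" ")
        elif 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)
```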
For users familiar with alphanumeric keyboards, it is easier to type the phonetic sounds rather than the Hiragana or Katakana characters. The JIM provides Romaji-to-Kana conversion (RKC), allowing the user to type in the phonetic sounds of Hiragana or Katakana characters on an alphanumeric keyboard.
When operating in Romaji-to-Kana conversion mode, you follow two steps to produce Kanji characters. First, input Hiragana characters by typing their Romaji spellings. In this step, you produce each Hiragana character by typing 1 to 3 Romaji alphabetic keys that compose the phonetic sound of the character. Second, convert the Hiragana characters to Kanji characters by pressing the Henkan key. Many Kanji characters may be associated with a single phonetic phrase. The Henkan key displays the most likely Kanji candidate. Pressing the Henkan key repeatedly displays the additional candidates.
For example, when you enter the Kanji characters for the phonetic sound "k-a-n-j-i", you must do two things:
You can now press the keys that spell "kanji". As each phonetic sound is completed, a Hiragana character displays.
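The first step, turning Romaji keystrokes into Hiragana, can be sketched with a longest-match lookup over chunks of 1 to 3 letters. The table below is a tiny illustrative subset, and the greedy strategy is an assumption made for this example rather than a description of how the JIM is implemented.

```python
# Small illustrative subset of a Romaji-to-Hiragana table.
ROMAJI_TO_HIRAGANA = {
    "a": "あ", "i": "い",
    "ka": "か", "ki": "き",
    "na": "な", "ni": "に", "n": "ん",
    "ji": "じ", "ta": "た",
    "kya": "きゃ",
}


def romaji_to_hiragana(text):
    """Convert Romaji to Hiragana, trying 3-, 2-, then 1-letter chunks."""
    out = []
    i = 0
    while i < len(text):
        for length in (3, 2, 1):
            chunk = text[i:i + length]
            if len(chunk) == length and chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += length
                break
        else:
            # No table entry: pass the letter through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(romaji_to_hiragana("kanji"))  # かんじ
```

Trying the longer chunks first lets "ni" win over a bare "n", so "kani" becomes かに while "kanji" becomes かんじ.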
The Hiragana character is displayed with visual feedback to indicate that the JIM is composing in a pre-edit state. The character is underlined and shown in reverse video. This feedback facility is known as a callback. See Using Callbacks for more information.
To convert the Hiragana character within the pre-edit string to a Kanji character, press the Henkan key. The most likely candidate associated with the phonetic Hiragana sound displays. Pressing this key repeatedly shows other candidates.
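Candidate cycling with the Henkan key can be pictured as stepping through an ordered list. In this sketch the dictionary content, the class name, and the wrap-around at the end of the list are all assumptions made for illustration; the real ordering depends on the dictionary and tables provided to the JIM.

```python
# Hypothetical dictionary: reading -> candidates, most likely first.
SAMPLE_DICTIONARY = {
    "かんじ": ["漢字", "感じ", "幹事"],
}


class CandidateCycler:
    """Steps through the Kanji candidates for one Hiragana reading."""

    def __init__(self, yomi, dictionary):
        self.yomi = yomi
        self.candidates = dictionary.get(yomi, [])
        self.index = -1  # nothing converted yet

    def henkan(self):
        """Henkan: show the first candidate, then the next on each press."""
        if not self.candidates:
            return self.yomi
        self.index = (self.index + 1) % len(self.candidates)
        return self.candidates[self.index]

    def mae_kouho(self):
        """MaeKouho: step back to the previous candidate."""
        if not self.candidates or self.index < 0:
            return self.yomi
        self.index = (self.index - 1) % len(self.candidates)
        return self.candidates[self.index]
```

With this sample data, the first `henkan()` call returns 漢字, the second returns 感じ, and `mae_kouho()` then steps back to 漢字.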
During the composition process, the pre-edit string is partitioned into segments that can be considered Kanji words. After a string of kana characters is converted into a candidate, it is treated as one of these convertible segments. While the pre-edit string is displayed, the JIM uses the cursor key and other keys to manipulate the string.
To commit the pre-edit string to the program, the user presses the Enter key. In this case, the Enter key code itself is not sent to the program, only the string.
The Muhenkan keysym can also be used to turn off pre-edit and commit the Hiragana or Katakana character directly to the program.
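The two commit paths can be summarized in a short function. This is a simplified sketch with hypothetical names; it assumes that any other key keeps the JIM in the composing state and is not passed on to the program.

```python
def preedit_key(keysym, kana, candidate=None):
    """Return (committed_text, key_forwarded_to_program) for one key press.

    kana      -- the Hiragana or Katakana currently in the pre-edit string
    candidate -- the Kanji candidate currently shown, if any
    """
    if keysym == "Return":
        # Enter commits the pre-edit string; the key code itself is not sent.
        return (candidate if candidate is not None else kana, False)
    if keysym == "Muhenkan":
        # Non-conversion: commit the Kana as is.
        return (kana, False)
    # Assumption: every other key is consumed while composing continues.
    return (None, False)
```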
The following table depicts the shift state transition and the interaction of the RKC mode key with the shift states.
Character Encoding | Code Points | Description | Count |
000xxxxx | 00-1F | Controls | 32 |
00100000 | 20 | Space | 1 |
0xxxxxxx | 21-7E | 7-bit ASCII | 94 |
01111111 | 7F | Delete | 1 |
10000000 | 80 | Undefined | 1 |
100xxxxx 01xxxxxx | [81-9F] [40-7E] | Double byte | 1953 |
100xxxxx 1xxxxxxx | [81-9F] [80-FC] | Double byte | 3844 |
10100000 | A0 | Undefined | 1 |
1xxxxxxx | A1-DF | 8-bit single byte | 63 |
111xxxxx 01xxxxxx | [E0-FC] [40-7E] | Double byte | 1827 |
111xxxxx 1xxxxxxx | [E0-FC] [80-FC] | Double byte | 3596 |
11111101 | FD | Undefined | 1 |
11111110 | FE | Undefined | 1 |
11111111 | FF | All ones | 1 |
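The code-point ranges in the table translate directly into a byte classifier. The sketch below uses only the ranges listed above; the function names are hypothetical, and treating the undefined and all-ones bytes as errors when splitting a byte string is an assumption of this example.

```python
def classify_byte(b):
    """Classify a first byte using the ranges from the table."""
    if b <= 0x1F:
        return "control"
    if b == 0x20:
        return "space"
    if b <= 0x7E:
        return "ascii"
    if b == 0x7F:
        return "delete"
    if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC:
        return "double-byte lead"
    if 0xA1 <= b <= 0xDF:
        return "8-bit single byte"
    if b == 0xFF:
        return "all ones"
    return "undefined"  # 0x80, 0xA0, 0xFD, 0xFE


def is_trail_byte(b):
    """Second bytes fall in 40-7E or 80-FC."""
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def char_lengths(data):
    """Return the byte length (1 or 2) of each character in data."""
    lengths = []
    i = 0
    while i < len(data):
        kind = classify_byte(data[i])
        if kind == "double-byte lead":
            if i + 1 >= len(data) or not is_trail_byte(data[i + 1]):
                raise ValueError("lead byte without a valid second byte")
            lengths.append(2)
            i += 2
        elif kind in ("undefined", "all ones"):
            raise ValueError("byte 0x%02X is not a character" % data[i])
        else:
            lengths.append(1)
            i += 1
    return lengths
```

For example, `char_lengths(bytes([0x41, 0xB6, 0x8A, 0xBF]))` returns `[1, 1, 2]`: one 7-bit ASCII byte, one 8-bit single byte, and one double-byte character.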
The JIM has the following types of auxiliary areas:
A Kana-to-Kanji conversion operation on a string of Hiragana or Katakana characters can yield from one to a hundred Kanji candidates. At worst, you would have to press the conversion key more than a hundred times to get the correct Kanji character.
In such cases, it is more convenient to find the correct character by requesting the All Candidates menu with the ZenKouho or the Alt-Henkan keysym. This menu displays if the current target (a Kanji word that the cursor is pointing to in the pre-edit area) has several alternative candidates associated with it. The menu contains multiple candidates for selection. The All Candidates menu disappears when the Reset keysym is pressed, the Enter key is pressed, or a candidate is selected.
A Kanji Number Input dialog prompts the user to select the Kanji character by entering 3 to 5 digits. The digits represent the code of the character. Online dictionaries allow a user to search for the code. The ordering formats for these dictionaries vary. For example, one dictionary lists codes by phonetic sound. Another dictionary orders codes by the number of strokes used to compose the character. The KanjiBangou keysym invokes this menu. The menu is terminated with either the Reset or Return keysym.
The HenkanMenu keysym invokes the Conversion Mode menu. Four items are displayed for selection. The most important items are the word-conversion mode and phrase-conversion mode. Make a selection by choosing a number and pressing the Return keysym. This menu is terminated when either a selection is made or the Reset keysym is pressed.
A run-time registration dialog prompts the user to input a Kana string and a Kanji string for registering the mapping of the strings in the user dictionary. After the pair is registered, the JIM can use it as a conversion candidate. The menu is terminated with the Escape or Reset keysym.
The presentation of menus depends on the interface environment in which the JIM is operating. For example, some interfaces support scrolling menus that use the Page Down and Page Up keys.
The following keymaps are supported by the JIM:
The JIM uses the keysyms in the XK_KATAKANA, XK_LATIN1, and XK_MISCELLANY groups.
The following reserved keysyms are unique to the input method of this system:
XK_BunsetsuYomi | 0x1800ff05 | Moves back a phrase to Yomi. |
XK_MaeKouho | 0x1800ff04 | Previous candidate. |
XK_ZenKouho | 0x1800ff01 | All candidates. |
XK_KanjiBangou | 0x1800ff02 | Kanji number input. |
XK_HenkanMenu | 0x1800ff03 | Changes conversion mode. |
XK_LeftDouble | 0x1800ff06 | Moves cursor two characters left. |
XK_RightDouble | 0x1800ff07 | Moves cursor two characters right. |
XK_LeftPhrase | 0x1800ff08 | Reserved for future use. |
XK_RightPhrase | 0x1800ff09 | Reserved for future use. |
XK_ErInput | 0x1800ff0a | Discards the current pre-edit string. |
XK_Reset | 0x1800ff0b | Reset. |
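A program that needs to recognize these reserved keysyms could keep them in a lookup table. The sketch below copies a few of the values from the table above; the dictionary and function names are hypothetical and only illustrate the idea.

```python
# A few of the reserved keysym values listed in the table above.
RESERVED_KEYSYMS = {
    "XK_ZenKouho": 0x1800FF01,
    "XK_KanjiBangou": 0x1800FF02,
    "XK_HenkanMenu": 0x1800FF03,
    "XK_MaeKouho": 0x1800FF04,
    "XK_BunsetsuYomi": 0x1800FF05,
    "XK_ErInput": 0x1800FF0A,
}

# Reverse map for looking up a name from a numeric keysym value.
KEYSYM_NAMES = {value: name for name, value in RESERVED_KEYSYMS.items()}


def reserved_keysym_name(value):
    """Return the reserved keysym name for value, or None if not listed."""
    return KEYSYM_NAMES.get(value)
```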
The following modifiers are supported by the JIM:
ShiftMask | 0x01 |
LockMask | 0x02 |
ControlMask | 0x04 |
Mod1Mask (Left-Alt) | 0x08 |
Mod2Mask (Right-Alt) | 0x10 |
The following internal modifiers are supported by the JIM:
Kana | 0x20 |
Romaji | 0x40 |
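Because each modifier occupies its own bit, a state value can be decoded with simple bit tests. The sketch below uses the mask values from the two tables above; the function name is hypothetical.

```python
# Mask values taken from the modifier tables above.
MODIFIERS = {
    "ShiftMask": 0x01,
    "LockMask": 0x02,
    "ControlMask": 0x04,
    "Mod1Mask": 0x08,   # Left-Alt
    "Mod2Mask": 0x10,   # Right-Alt
    "Kana": 0x20,       # internal modifier
    "Romaji": 0x40,     # internal modifier
}


def decode_state(state):
    """Return the names of all modifiers whose bit is set in state."""
    return [name for name, bit in MODIFIERS.items() if state & bit]


print(decode_state(0x01 | 0x20))  # ['ShiftMask', 'Kana']
```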
For related information, see Understanding the ISO Code Sets in AIX 5L Version 5.2 Kernel Extensions and Device Support Programming Concepts.