National Language Support Guide and Reference

Input Methods

Applications that run in the international environment supported by National Language Support (NLS) need input methods. The input method is an application programming interface (API) that allows you to develop applications independent of a particular language, keyboard, or code set. Each type of input method has the following features:

Keymaps	Set of input method keymaps (imkeymaps) that works with the input method and determines the supported locales.
Keysyms	Set of key symbols (keysyms) that the input method can handle.
Modifiers	Set of modifiers or states, each having a mask value, that the input method supports.
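
To illustrate the idea of modifiers that each carry a mask value, the following C sketch keeps a small table of modifier names and one-bit masks, then combines and tests them in a single state word. The names modifier_mask and state_has, and the mask values themselves, are hypothetical and chosen only for this example; they are not the values defined by the AIX input method headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical modifier table: one bit per modifier state.
 * The values are illustrative, not taken from any AIX header. */
struct modifier {
    const char  *name;
    unsigned int mask;
};

static const struct modifier modifiers[] = {
    { "Shift",      1u << 0 },
    { "Lock",       1u << 1 },
    { "Ctrl",       1u << 2 },
    { "Alt",        1u << 3 },
    { "AltGraphic", 1u << 4 },
};

/* Return the mask for a modifier name, or 0 if the name is unknown. */
unsigned int modifier_mask(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof modifiers / sizeof modifiers[0]; i++) {
        if (strcmp(modifiers[i].name, name) == 0)
            return modifiers[i].mask;
    }
    return 0;
}

/* Return nonzero when every bit of a nonzero mask is set in state. */
int state_has(unsigned int state, unsigned int mask)
{
    return mask != 0 && (state & mask) == mask;
}
```

Because each modifier occupies its own bit, a combination such as Shift plus Alt is the bitwise OR of the two masks, and a single AND tests whether a given modifier is active.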

Input Method Introduction

An input method is a set of functions that translates keystrokes into character strings in the code set specified by your locale. Input method functions include locale-specific input processing and keyboard controls (for example, Ctrl, Alt, Shift, Lock, and Alt-Graphic). The input method allows various types of input, but only keyboard events are dealt with in this chapter.

Your locale determines which input method should be loaded, how the input method runs, and which devices are used. The input method then defines states and their outcome.

When the input method translates a keystroke into a character string, the translation process takes into account the keyboard and the code set you are using. You can write your own input method if you do not have a standard keyboard or if you customize your code set.

Many languages use a small set of symbols or letters to form words. To enter text with a keyboard, you press keys that correspond to symbols of the alphabet. When a character in your alphabet does not exist on the keyboard, you must press a combination of keys. Input methods provide algorithms that allow you to compose such characters.

Some languages use an ideographic writing system. They use a unique symbol, rather than a group of letters, to represent a word. For instance, the character sets used in China, Japan, Korea, and Taiwan have more than 5,000 characters. Consequently, more than one byte must be used to represent a character. Moreover, a single keyboard cannot include all the required ideographic symbols. You need input methods that can compose multibyte characters.

The /usr/lib/nls/loc directory contains the input methods installed on your system. You can list the contents of this directory to determine which input methods are available to you. Input method file names have the format Language_Territory.im. For example, the fr_BE.im file is the input method file for the French language as used in Belgium.
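
As a small illustration of the Language_Territory.im naming format, the following C sketch splits a file name such as fr_BE.im into its language and territory parts. The function name parse_im_filename is hypothetical, and the sketch handles only the plain form described above; it does not attempt to handle any other file names that may appear in the directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split a name of the form Language_Territory.im into its two parts.
 * Returns 0 on success, or -1 if the name does not have that form or
 * a part does not fit in its buffer. Hypothetical helper for illustration. */
int parse_im_filename(const char *file,
                      char *lang, size_t lang_size,
                      char *terr, size_t terr_size)
{
    assert(file != NULL && lang != NULL && terr != NULL);

    size_t len = strlen(file);
    if (len < 4 || strcmp(file + len - 3, ".im") != 0)
        return -1;                      /* missing ".im" suffix */

    const char *sep = strchr(file, '_');
    if (sep == NULL || sep == file)
        return -1;                      /* no language part */

    size_t lang_len = (size_t)(sep - file);
    size_t terr_len = len - 3 - lang_len - 1;
    if (terr_len == 0 || lang_len >= lang_size || terr_len >= terr_size)
        return -1;                      /* no territory, or buffer too small */

    memcpy(lang, file, lang_len);
    lang[lang_len] = '\0';
    memcpy(terr, sep + 1, terr_len);
    terr[terr_len] = '\0';
    return 0;
}
```

For fr_BE.im, the sketch yields the language fr and the territory BE.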

Through a well-structured protocol, input methods allow applications to support different input without using locale-specific input processing.

In AIX, the input method is provided in the aixterm. When characters typed from the AIXwindows interface reach the server, the characters are in the form of key codes. A table provided in the client converts key codes into keysyms, a predefined set of codes. Any key code generated by a keyboard should have a keysym. These keysyms are maintained and allocated by the MIT X Consortium. The keysyms are passed to the client aixterm terminal emulator. In the aixterm, the input keysyms are converted into file codes by the input method and are then sent to the application.

The X server is designed to work with the display adapter provided in the system hardware. The X server communicates with the X client through sockets. Thus, the server and the client can reside on different systems in a network, provided they can communicate with each other.

The data from the keyboard enters the X server, and from the server, it is passed to the terminal emulator. The terminal emulator passes the data to the application. When data comes from applications to the display device, it passes through the terminal emulator by sockets to the server and from the server to the display device.

Input Method Names

The set of input methods available depends on which locales have been installed and what input methods those locales provide. The name of the input method usually corresponds to the locale. For example, the Greek Input Method is named el_GR, which is the same as the locale for the Greek language spoken in Greece.

When there is more than one input method for a locale, any secondary input method is identified by a modifier that is part of the locale name. For example, the French locale, as spoken in Canada, has a default input method and an alternative input method, and the default method also exists as a separate 64-bit object. The input method names are:

fr_CA	Default input method
fr_CA@im=alt	Alternative input method
fr_CA.im__64	64-bit input method

The fr portion of the locale represents the language name (French), and the CA represents the territory name (Canada). The @im=alt string is the modifier portion of the locale that is used to identify the alternative input method. All modifier strings are identified by the format @im=Modifier.

Because the input method is a loadable object module, a different object is required when running in the 64-bit environment. In the 64-bit environment, the input method library automatically appends __64 to the name when searching for the input method. In the preceding example, the name of the input method would be fr_CA.im__64.

It is possible to name input methods without using the locale name. Because the libIM library does not restrict names to locale names, the calling application must ensure that the name passed to libIM can be found. However, applications should accept only modifier strings of the form @im=Modifier from the user, and should concatenate the user's request with the string returned by the setlocale (LC_CTYPE,NULL) subroutine.
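
The naming rules in this section can be sketched in C as follows. The function build_im_name appends an @im=Modifier string to a locale name, build_im_name_for_current_locale takes the locale name from setlocale(LC_CTYPE, NULL), and im_object_file forms an object file name with the optional __64 suffix. All three function names are hypothetical. The sketch only illustrates the string conventions described here; it is not the search logic of libIM, and applying the .im and __64 pattern to names other than the fr_CA example is an assumption.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Write "<locale>" or "<locale>@im=<modifier>" into out.
 * Returns 0 on success, -1 if the result does not fit. */
int build_im_name(const char *locale, const char *modifier,
                  char *out, size_t out_size)
{
    assert(locale != NULL && out != NULL);
    int n;
    if (modifier == NULL || modifier[0] == '\0')
        n = snprintf(out, out_size, "%s", locale);
    else
        n = snprintf(out, out_size, "%s@im=%s", locale, modifier);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Same as build_im_name, using the current LC_CTYPE locale name. */
int build_im_name_for_current_locale(const char *modifier,
                                     char *out, size_t out_size)
{
    const char *locale = setlocale(LC_CTYPE, NULL); /* query only */
    if (locale == NULL)
        return -1;
    return build_im_name(locale, modifier, out, out_size);
}

/* Write "<name>.im" or, for the 64-bit environment, "<name>.im__64".
 * Assumption: the pattern shown for fr_CA applies to other names too. */
int im_object_file(const char *name, int is_64bit,
                   char *out, size_t out_size)
{
    assert(name != NULL && out != NULL);
    int n = snprintf(out, out_size, "%s.im%s", name, is_64bit ? "__64" : "");
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With the locale fr_CA and the modifier alt, build_im_name produces fr_CA@im=alt, and im_object_file produces fr_CA.im__64 for the default method in the 64-bit environment.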

Input Method Areas

Complex input methods require direct dialog with users. For example, the Japanese Input Method may need to show a menu of candidate strings based on the phonetic matches of the keys that you enter. The feedback of the keystrokes appears in one or more areas on the display. The input method areas are as follows:

Status

Text data and bitmaps can appear in the Status area. The Status area is an extension of the light-emitting diodes (LEDs) on the keyboard.

Pre-edit

Intermediate text appears in the Pre-edit area for languages that compose before the client handles the data.

A common feature of input methods is that you press a combination of keys to represent a single character or set of characters. This process of composing characters from keystrokes is called pre-editing.

Auxiliary

Menus and dialogs that allow you to customize the input method appear in the Auxiliary area. You can have multiple Auxiliary areas managed by the input method and independent of the client.
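
The pre-editing process described for the Pre-edit area, in which a combination of keys produces a single character, can be sketched in C as a small state machine. The sketch assumes a single-byte ISO8859-1 code set, ordinary keys whose values fit in one byte, and two made-up dead-key codes (KEY_DEAD_ACUTE and KEY_DEAD_GRAVE). These codes, the rule table, and the function preedit_feed are hypothetical and are not part of the AIX input method interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key codes for two dead keys (not real keysym values). */
enum { KEY_DEAD_ACUTE = 0x1001, KEY_DEAD_GRAVE = 0x1002 };

/* One composition rule: dead key + base key -> ISO8859-1 byte. */
struct compose_rule {
    int           dead_key;
    int           base;
    unsigned char result;
};

static const struct compose_rule rules[] = {
    { KEY_DEAD_ACUTE, 'e', 0xE9 },  /* e with acute accent */
    { KEY_DEAD_ACUTE, 'a', 0xE1 },  /* a with acute accent */
    { KEY_DEAD_GRAVE, 'e', 0xE8 },  /* e with grave accent */
    { KEY_DEAD_GRAVE, 'a', 0xE0 },  /* a with grave accent */
};

/* Pre-edit state: the dead key waiting for a base key, or 0 if none. */
struct preedit {
    int pending;
};

/* Feed one key. Returns the number of bytes stored in out (0 or 1). */
size_t preedit_feed(struct preedit *pe, int key, unsigned char *out)
{
    assert(pe != NULL && out != NULL);

    if (key == KEY_DEAD_ACUTE || key == KEY_DEAD_GRAVE) {
        pe->pending = key;          /* hold the key; nothing to emit yet */
        return 0;
    }
    if (pe->pending != 0) {
        int dead = pe->pending;
        pe->pending = 0;
        for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
            if (rules[i].dead_key == dead && rules[i].base == key) {
                out[0] = rules[i].result;
                return 1;
            }
        }
        /* No rule matched: drop the dead key and pass the base key through. */
    }
    out[0] = (unsigned char)key;
    return 1;
}
```

While a dead key is pending, nothing is delivered to the client, which corresponds to the intermediate text held in the Pre-edit area. An input method for an ideographic language would keep far more state, but the principle of buffering keystrokes until a character can be committed is the same.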

Management for input method areas is based on the division of responsibility between the application (or toolkit) and the input method. The divisions of responsibility are as follows:

Applications are responsible for the size and position of the input method area.
Input methods are responsible for the contents of the input method area. The input method cannot suggest a placement for the area.

Input Method Command

An input method is a set of subroutines that translate keystrokes into character strings in the code set specified by a locale. Input method subroutines include logic for locale-specific input processing and keyboard controls (Ctrl, Alt, Shift, Lock, Alt-Graphic). The following command lets you customize the input method mapping that these subroutines use:

keycomp: Compiles a keyboard mapping file into an input method keymap file.