AIX Version 4.3 General Programming Concepts: Writing and Debugging Programs
Layout (Bidirectional Text and Character Shaping) Overview
Bidirectional (BIDI) text results when text with different directional orientations appears together. For example, English text is read from left to right, whereas Arabic and Hebrew text is read from right to left. If both English and Hebrew text appear on the same line, the text is bidirectional.
Write bidirectional text according to the following guidelines:
- Arabic and Hebrew words are written from right to left. (A character string is considered a word for the purposes of sequencing in an alphanumeric environment.)
- Numbers and English quotations are written from left to right.
- Digits and their punctuation marks are written from left to right.
Bidirectional script is read from right to left and from top to bottom.
If the embedded text fits on one line, it is written from left to right and embedded in the bidirectional text. However, if the embedded text is split across two or more lines, the left-to-right portions must stay in the correct order so that the text can still be read from top to bottom.
For example, right-to-left text embedded in left-to-right text that is contained in one line is written as follows:
THERE IS txet lanoitceridib deddebme IN THIS SENTENCE.
Right-to-left text embedded in left-to-right text that is split between two lines is written as follows:
THERE IS senil owt neewteb tilps si taht txet lanoitceridib deddebme IN THIS SENTENCE.
Both texts maintain readability even though the embedded text is split.
Data Streams
Bidirectional text environments use the following data streams:
Visual Data Streams
The system organizes characters in the sequence in which they are presented on the screen.
If a visual data stream is presented from left to right, the first character of the data stream is on the left side of the viewport (screen, window, line, field, and so on). If the same data stream is presented on a right-to-left viewport, the initial character of the data stream is on the right.
If a language of opposite writing orientation is embedded in the visual data stream, the sequence of each text is preserved when the viewport orientation is reversed. For example (lowercase text represents the embedded right-to-left text), if the keystroke order is:
THERE IS bidirectional text IN THIS SENTENCE.
then the visual data stream is:
THERE IS txet lanoitceridib IN THIS SENTENCE.
This visual data stream's presentation on a left-to-right viewport is left-justified, as follows:
THERE IS txet lanoitceridib IN THIS SENTENCE.
-------> <----------------- ---------------->
The arrows indicate reading direction.
If you change the viewport orientation to right-to-left, the visual data stream is reversed, right-justified, and unreadable, as follows:
.ECNETNES SIHT NI bidirectional text SI EREHT
<---------------- -----------------> <-------
Thus, if English text is embedded in Arabic or Hebrew text, both texts are in proper reading order only on a left-to-right viewport. The same is true for Arabic or Hebrew embedded in English. Reversing the viewport orientation makes both texts unreadable.
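The behavior described above can be sketched in a few lines of Python. This is a toy model and not AIX code: the function name present_visual and the "ltr" and "rtl" viewport labels are hypothetical, and lowercase letters stand in for right-to-left script as in the examples.

```python
def present_visual(visual_stream, viewport="ltr"):
    """Lay out a visual data stream on a viewport (toy model).

    A visual stream is already in display order, so its first character
    goes to the starting edge of the viewport: the left edge for "ltr",
    the right edge for "rtl".
    """
    if viewport == "ltr":
        return visual_stream
    if viewport == "rtl":
        # Filling the line from the right edge mirrors the whole stream.
        return visual_stream[::-1]
    raise ValueError("viewport must be 'ltr' or 'rtl'")


stream = "THERE IS txet lanoitceridib IN THIS SENTENCE."
print(present_visual(stream, "ltr"))
print(present_visual(stream, "rtl"))
```

Because no reordering is applied, the right-to-left presentation is a plain mirror image of the stream, which is why both texts become unreadable.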
Logical Data Streams
The system organizes characters in a readable sequence. The bidirectional presentation-management functions arrange text strings in a readable order.
If a logical data stream is presented on a left-to-right viewport, the initial character of the data stream is presented on the left side. If the same data stream is presented on a right-to-left viewport, the initial character of the data stream is presented on the right side, though it is still presented in a readable order.
If a language of opposite writing orientation is embedded in the logical data stream, the orientations of each text are preserved by the bidirectional presentation-management functions. For example, if the keystroke order is:
THERE IS bidirectional text IN THIS SENTENCE.
then the logical data stream is the same:
THERE IS bidirectional text IN THIS SENTENCE.
This logical data stream's presentation on a left-to-right viewport (left-justified) is as follows:
THERE IS txet lanoitceridib IN THIS SENTENCE.
-------> <----------------- ---------------->
The logical data stream's presentation on a right-to-left viewport (right-justified) is as follows:
IN THIS SENTENCE. txet lanoitceridib THERE IS
----------------> <----------------- ------->
The logical data stream is readable on both viewport orientations.
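As an illustration only, the following Python sketch mimics what a presentation-management function does with a logical data stream. It is not the AIX implementation: the names is_rtl_word, directional_runs, and present_logical are hypothetical, whole words are classified by the toy convention that lowercase letters stand in for right-to-left script, and neutral characters such as spaces are handled in the simplest possible way.

```python
from itertools import groupby


def is_rtl_word(word):
    # Toy convention from the examples: lowercase means right-to-left script.
    return any(ch.islower() for ch in word)


def directional_runs(logical_stream):
    """Group consecutive words of the same direction into (is_rtl, text) runs."""
    return [(rtl, " ".join(words))
            for rtl, words in groupby(logical_stream.split(), key=is_rtl_word)]


def present_logical(logical_stream, viewport="ltr"):
    """Return the display line for a logical stream on the given viewport."""
    runs = directional_runs(logical_stream)
    if viewport == "rtl":
        # On a right-to-left viewport the first run starts at the right edge.
        runs = runs[::-1]
    # Each run keeps its own orientation: right-to-left runs are mirrored,
    # left-to-right runs are shown as stored.
    return " ".join(text[::-1] if rtl else text for rtl, text in runs)


stream = "THERE IS bidirectional text IN THIS SENTENCE."
print(present_logical(stream, "ltr"))
print(present_logical(stream, "rtl"))
```

The stored stream never changes. Only the order of the runs depends on the viewport, so each run stays readable in its own direction on both orientations.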
Cursor Movement
Cursor movement on a screen containing bidirectional text is as follows:
Visual
The cursor moves from its current position left or right to the next character, or up or down to the next row. For example, if the cursor is located at the end of the first left-to-right part of a mixed sentence:
THERE IS_txet lanoitceridib IN THIS SENTENCE.
then moving the cursor visually to the right moves it one character to the right, onto the first displayed character of txet, as follows:
THERE IS txet lanoitceridib IN THIS SENTENCE.
The cursor moves without regard to the contents of the text.
Logical
The cursor moves from its current position to the next or previous character in the data stream. That character may be adjacent to the cursor's position, elsewhere in the same line, or on another line on the screen. Logical cursor movement requires scanning the data stream to find the next logical character. For example, if the cursor is located at the end of the first left-to-right part of a mixed sentence:
THERE IS_txet lanoitceridib IN THIS SENTENCE.
then, moving the cursor logically to the next character causes the data stream to be scanned to find the next logical character. The cursor moves to the next logical part of the sentence, as follows:
THERE IS txet lanoitceridib_IN THIS SENTENCE.
The cursor moves according to content.
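The difference between the two kinds of movement can be sketched in Python. This is a toy model under stated assumptions and not AIX code: the names layout and caret_column are hypothetical, the viewport is left-to-right, lowercase letters stand in for right-to-left script, and the cursor is treated as a caret that sits before a logical character (on the left edge of a left-to-right character cell, on the right edge of a right-to-left one).

```python
def layout(logical):
    """Return (order, rtl) for a left-to-right viewport.

    order lists logical indices in display order, left to right.
    rtl is the set of logical indices that belong to right-to-left runs.
    """
    order, rtl, i, n = [], set(), 0, len(logical)
    while i < n:
        if logical[i].islower():
            j = i
            # A run covers lowercase letters plus spaces between lowercase words.
            while j < n and (logical[j].islower()
                             or (logical[j] == " " and j + 1 < n
                                 and logical[j + 1].islower())):
                j += 1
            order.extend(range(j - 1, i - 1, -1))  # displayed in reverse
            rtl.update(range(i, j))
            i = j
        else:
            order.append(i)
            i += 1
    return order, rtl


def caret_column(logical, offset):
    """Screen column of the caret that sits before logical character offset."""
    order, rtl = layout(logical)
    column = order.index(offset)  # scan for where that character is displayed
    # Before a right-to-left character the caret is on the cell's right edge.
    return column + 1 if offset in rtl else column


text = "THERE IS bidirectional text IN THIS SENTENCE."
start = caret_column(text, 8)         # caret after "THERE IS"
visual_next = start + 1               # visual move: one column, no scan
logical_next = caret_column(text, 9)  # logical move: find the next character
print(start, visual_next, logical_next)
```

In this model the caret starts at column 8. A visual move takes it to column 9, whereas a logical move takes it to column 27, the position of the underscore after lanoitceridib in the example above.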
Character Shaping
Character shaping occurs when the shape of a character is dependent on its position in a line of text. In some languages, such as Arabic, characters have different shapes depending on their position in a string and on the surrounding characters.
The following characteristics determine character shaping in Arabic script:
- The written language has no equivalent to capital letters.
- The characters have different shapes, depending on their position in a string and on the surrounding characters.
- The written language is cursive. Most characters of a word are connected, as in English handwriting.
- Joined characters can form nonspacing characters. Additionally, a character can have a vowel or diacritic mark written over or under it.
- Characters can vary in length, resulting in an output of two coded shapes.
Methods of Character Shaping
Implement character shaping separately from other system components. However, character shaping should be accessible as a utility by other system components. The system may use character shaping in the following ways:
- As the user enters data into the computer, the system uses character shaping to shape the characters. The system stores these characters in their shaped format.
This method avoids the need to use character shaping every time these characters are displayed. This method is meant for static data such as menus and help. This method requires preprocessing for proper sorting, searching, or indexing of the characters.
The characters may need reshaping after processing for proper presentation.
- As the user enters data into the computer, the system stores the characters in their unshaped format.
This method allows for sorting, searching or indexing of the characters. However, the system must use character shaping every time the characters are displayed.
Base shapes are isolated shapes that were not generated by character shaping. Use base shapes during editing, searching for character strings, or other text operations. Use shaping only when the text is displayed or printed. If characters are stored in their shaped form, the system must deshape them before sorting, collating, searching, or indexing. Character shapes that are not determined by their position in a string are needed for specific character-handling applications and for communication with different coding environments.
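The deshape-before-searching step can be illustrated with a short Python sketch. It does not use the AIX layout library. As a stand-in, it assumes the shaped text is stored as Unicode Arabic presentation forms and uses Unicode compatibility normalization (NFKC) to fold them back to base letters. The function name deshape is hypothetical.

```python
import unicodedata


def deshape(text):
    """Map shaped presentation forms back to base letters (toy stand-in).

    NFKC normalization folds the Arabic presentation forms into their base
    letters. It also folds other compatibility characters, so real code may
    need a narrower mapping.
    """
    return unicodedata.normalize("NFKC", text)


# A word stored in shaped form: kaf initial, teh medial, alef final, beh isolated.
shaped = "".join(unicodedata.lookup(name) for name in (
    "ARABIC LETTER KAF INITIAL FORM",
    "ARABIC LETTER TEH MEDIAL FORM",
    "ARABIC LETTER ALEF FINAL FORM",
    "ARABIC LETTER BEH ISOLATED FORM",
))

# The same word in base form, as a user would type it as a search key.
base = "".join(unicodedata.lookup(name) for name in (
    "ARABIC LETTER KAF",
    "ARABIC LETTER TEH",
    "ARABIC LETTER ALEF",
    "ARABIC LETTER BEH",
))

print(shaped == base)           # the shaped text does not match the key directly
print(deshape(shaped) == base)  # after deshaping, the comparison succeeds
```

Storing the base form and shaping only for display avoids this extra step, at the cost of shaping the text every time it is presented.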
Contextual Character Shaping
In general, contextual character shaping is the selection of the required shape of a character in a given font depending on its position in a word and its surrounding characters. The following shapes are possible:
Isolated
A character that is connected to neither a preceding nor a succeeding character.
Final
A character that is connected to a preceding character but not to a succeeding character.
Initial
A character that is connected to a succeeding character but not to a preceding character.
Middle
A character that is connected to both a preceding and a succeeding character.
A character may also have any of the following characteristics:
- Connecting to a preceding character.
- Connecting to a succeeding character.
- Allowing surrounding characters' connections to pass through it.
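The shape selection described above can be sketched in Python as follows. This is a minimal illustration, not the AIX layout library: the names RIGHT_JOINING, TRANSPARENT, joins_preceding, joins_following, and select_shapes are hypothetical, and the character tables are deliberately partial (a few letters that connect only to a preceding character, plus the common vowel marks that let connections pass through them).

```python
# Letters that connect to a preceding character only (partial table):
# alef with madda, alef with hamza above, waw with hamza, alef with hamza
# below, alef, teh marbuta, dal, thal, reh, zain, waw.
RIGHT_JOINING = set("\u0622\u0623\u0624\u0625\u0627\u0629"
                    "\u062F\u0630\u0631\u0632\u0648")
# Marks that let surrounding connections pass through: fathatan to sukun.
TRANSPARENT = set("\u064B\u064C\u064D\u064E\u064F\u0650\u0651\u0652")

SHAPES = {
    (False, False): "isolated",
    (True, False): "final",
    (False, True): "initial",
    (True, True): "middle",
}


def joins_preceding(ch):
    # Toy rule: every Arabic letter from U+0622 to U+064A can connect backward.
    return "\u0622" <= ch <= "\u064A"


def joins_following(ch):
    return joins_preceding(ch) and ch not in RIGHT_JOINING


def select_shapes(word):
    """Return one shape name per letter of word, ignoring transparent marks."""
    letters = [ch for ch in word if ch not in TRANSPARENT]
    shapes = []
    for k, ch in enumerate(letters):
        prev_ch = letters[k - 1] if k > 0 else None
        next_ch = letters[k + 1] if k + 1 < len(letters) else None
        to_prev = (prev_ch is not None and joins_following(prev_ch)
                   and joins_preceding(ch))
        to_next = (next_ch is not None and joins_following(ch)
                   and joins_preceding(next_ch))
        shapes.append(SHAPES[(to_prev, to_next)])
    return shapes


# kaf, teh, alef, beh
print(select_shapes("\u0643\u062A\u0627\u0628"))
```

In this word the alef connects only to the preceding teh, so the beh that follows it takes the isolated shape even though it is the last letter of the word. A vowel mark inserted between two letters does not change the result, because the connection passes through it.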
Acronyms, part numbers, and graphic characters do not need contextual character shaping. To properly enter these characters, turn off the contextual character shaping and use a specific keyboard interface for exact selection of the desired shape. Tag these characters by field, line, or control character for later presentation.
Introducing Layout Library Subroutines
For information on the layout library, see The Open Group website:
www.opengroup.org
Or order "Portable Layout Services: Context-dependent and Directional Text"
Book# C616 ISBN 1-85912-142-X January 1997
From:
The Open Group,
Publications Department,
PO Box 96,
Witney,
Oxon OX8 6PG,
England
Tel: +44 (0)1993 708731, Fax: +44 (0)1993 708732