General Programming Concepts: Writing and Debugging Programs
The NLS Quick Reference provides
a place to get started internationalizing programs. The following
sections offer advice and a practical guide through the NLS
documentation:
The following list presents a set
of NLS guiding principles and advice. The intention is to prevent the
occurrence of common errors when internationalizing programs. See Chapter 16, National Language Support for more information about NLS.
- DO externalize any user
and error messages. We recommend the use of message catalogs. X
applications may use resource files to externalize messages for each
locale. See the Message Facility Overview for Programming for more information.
- DO use standard X/Open,
ISO/ANSI C, and POSIX functions to maximize portability. See the
National Language Support Subroutines Overview for more information.
- DO use the font set
specification in order to be code-set independent in X applications.
- DO use Xm (Motif)
library widgets for building bidirectional and character shaping
applications. See "Layout (Bidirectional) Support in
Xm (Motif) Library" in AIX 5L Version 5.1 AIXwindows
Programming Guide for general information. Refer to the
XmText or XmTextField widgets for support of input and
output of bidirectional and shaping characteristics.
- DON'T assume the
size of all characters to be 8 bits, or 1 byte. A character may occupy 1, 2,
3, 4, or more bytes. See Multibyte Code and Wide Character Code Conversion Subroutines and the Code Set Overview for more information.
- DON'T assume the
encoding of any code set. See the Code Set Overview for more information.
- DON'T hard code
names of code sets, locales, or fonts because it may impact
portability. See Chapter 16, National Language Support for more information.
- DON'T use p++ to
increment a pointer in a multibyte string. Use the mblen subroutine to determine the number of
bytes that compose a character.
- DON'T assume any
particular physical keyboard is in use. Use an input method based on
the locale setting to handle keyboard input. See the Input Method Overview for more information.
- DON'T define your
own converter unless absolutely necessary. See the Converters Overview for Programming for more information.
- DON'T assume that
the char data type is either signed or unsigned. This is
platform-specific. If the system in use defines
char as signed, comparisons with full 8-bit quantities
yield incorrect results. Because all 8 bits are used in encoding a
character, declare such data as unsigned char
wherever necessary. Also note that using a signed char value
to index an array may yield incorrect results. To make
programs portable, define 8-bit characters as unsigned char.
- DON'T use the
layout subroutines in the libi18n.a library unless the
application is doing presentation types of services. Most applications
just deal with logically ordered text. See Introducing Layout Library Subroutines for more information.
The National Language Support
(NLS) Checklist provides a way to analyze a program for NLS
dependencies. By going through this list, one can determine what, if
any, NLS functions must be considered. This is useful for both
programming and testing. If you identify a set of NLS items that a
program depends on, a test strategy can be developed. This facilitates
a common approach to testing all programs.
All major NLS considerations have
been identified. However, this list is not all-encompassing.
There may be other NLS questions that are not listed. See Chapter 16, National Language Support for more information about NLS. See National Language Support Do's and Don'ts for a brief list of NLS advice.
- Does the program display
translatable messages to the user, either directly or indirectly? Examples
of indirect messages are those that are stored in libraries.
If yes:
- Does the program compare
text strings?
If yes:
- Are the strings compared
to check equality only?
If yes:
- Are the strings compared
to see which one sorts before the other, as defined in the current locale?
If yes:
- Does the program parse
path names of files?
If yes:
- If looking for /
(slash), use the strchr subroutine.
- If looking for
characters, be aware that the file names can include multibyte
characters. In such cases, invoke the setlocale subroutine in the following manner and
then use appropriate search subroutines:
setlocale(LC_ALL, "")
- Does the program use
system names, such as node names, user names, printer names, and queue names?
If yes:
- System names can have
multibyte characters.
- To identify a multibyte
character, first invoke the setlocale
subroutine in the following manner and then use appropriate subroutines in the
library.
setlocale(LC_ALL, "")
- Does the program use
character class properties, such as uppercase, lowercase, and alphabetic?
If yes:
- Invoke the setlocale subroutine in the following
manner:
setlocale(LC_ALL, "")
- Do not make assumptions
about character properties. Always use system subroutines to determine
character properties.
- Are the characters
restricted to single-byte code sets?
If yes:
- Does the program convert
the case (upper or lower) of characters?
If yes:
- Invoke the setlocale subroutine in the following
manner:
setlocale(LC_ALL, "")
- Are the characters
restricted to single-byte code sets?
If yes:
- Use these conv subroutines: _tolower,
_toupper, tolower, or toupper.
If not, the characters may be multibyte characters:
- Does the program keep
track of cursor movement on a tty terminal?
If yes:
- Invoke the setlocale subroutine in the following
manner:
setlocale(LC_ALL, "")
- You may need to
determine the display column width of characters. Use the wcwidth or wcswidth subroutine.
- Does the program perform
character I/O?
If yes:
- Invoke the setlocale subroutine in the following
manner:
setlocale(LC_ALL, "")
- Are the characters
restricted to single-byte code sets?
If yes:
- Use the following subroutine
families:
If not:
- Use the following subroutine
families:
- Does the program step
through an array of characters?
If yes:
- Is the array limited to
single-byte characters only?
If yes:
- Does not require
setlocale(LC_ALL, "")
- If p is the pointer to
this array of single-byte characters, step through this array using
p++.
If not:
- Does the program need to
know the maximum number of bytes used to encode a character within the code
set?
If yes:
- Does the program format
date or time numeric quantities?
If yes:
- Does the program format
numeric quantities?
If yes:
- Invoke the setlocale subroutine in the following
manner:
setlocale(LC_ALL, "")
- Use the nl_langinfo or localeconv subroutine to obtain the
locale-specific information.
- Use the following pair
of subroutines, as needed: printf, scanf.
- Does the program format
monetary quantities?
If yes:
- Does the program search
for strings or locate characters?
If yes:
- Are you looking for
single-byte characters in single-byte text?
- Does not require
setlocale(LC_ALL, "")
- Use standard
libc string subroutines such as the strchr subroutine.
- Are you looking for
characters in the range 0x00-0x3F (the unique code-point range)?
- Are you looking for
characters in the range 0x00-0xFF?
- Invoke the setlocale subroutine in the following
manner:
setlocale(LC_ALL, "")
- Two methods are
available:
Use the mblen subroutine to skip multibyte
characters. Then, on encountering single-byte characters, check for
equality. See checklist item 2.
OR
Convert the search character and
the searched string to wide character form, and then use wide character search
subroutines. See Multibyte Code and Wide Character Code Conversion
Subroutines and Wide Character String Search Subroutines for more information.
- Does the program perform
regular-expression pattern matching?
If yes:
- Does the program ask the
user for affirmative/negative responses?
If yes:
- Does the program use
special box-drawing characters?
If yes:
- Do not use code
set-specific box-drawing characters like those in IBM-850.
- Instead use the
box-drawing characters and attributes specified in the terminfo file.
- Does the program perform
culture-specific or locale-specific processing that is not addressed here?
If yes:
- Externalize the
culture-specific modules. Do not make them part of the executable
program.
- Load the modules at run
time using subroutines provided by the system, such as the load subroutine.
- If the system does not
provide such facilities, link them statically but provide them in a modular
fashion.
The remaining checklist items are
specific to the AIXwindows systems.
- Does your client use
labels, buttons, or other output-only widgets to display translatable
messages?
If yes:
- Does your client use X
resource files to define the text of labels, buttons, or text widgets?
If yes:
- Put all resources that
need translation in one place.
- Consider using message
catalogs for the text strings. See the Message Facility Overview for Programming for more information.
- Do not use translated
color names, since color names are restricted to one encoding. The only
portable names are encoded in the portable character set.
- Put language-specific
resource files in /usr/lib/X11/%L/app-defaults/%N, where
%L is the name of the locale, such as fr_FR, and %N is
the name of the client.
- Is keyboard input
localized by language?
If yes:
- Invoke the
XtSetLanguageProc subroutine in the following
manner:
XtSetLanguageProc(NULL, NULL, NULL);
- Use the
XmText or XmTextField widgets for all text input.
Some of the XmText
widgets' arguments are defined in terms of character length instead of
byte length. The cursor position is maintained in character position,
not byte position.
- Are you using the
XmDrawingArea widget to do localized input?
- Use the input method
subroutines to do input processing in different languages. See the Input Method Overview and the IMAuxDraw Callback
subroutine for more information.
- Does your client present
lists or labels consisting of localized text from user files rather than from
X resource files?
If yes:
- Does your program do
any presentation operations (Xlib drawing, printing, formatting, or editing)
on bidirectional text?
If yes:
- Use the
XmText or XmTextField in the Xm (Motif) library.
These widgets are enabled for bidirectional text. See "Layout (Bidirectional) Support in Xm (Motif) Library" in
AIX 5L Version 5.1 AIXwindows Programming Guide for more
information.
- If the Xm library cannot
be used, use the layout subroutines to perform any re-ordering and shaping
on the text. See Introducing Layout Library Subroutines for more information.
- Store and communicate
the text in the implicit (logical) form. Some utilities (for example,
aixterm) support the visual form of bidirectional text, but most
NLS subroutines cannot process the visual form of bidirectional text.
If the response to all the above
items is no, then the program probably has no NLS dependencies. In this
case, you may not need the locale-setting subroutine setlocale and the catalog facility subroutines catopen and catgets.
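For programs that do have message dependencies, the catopen and catgets subroutines mentioned above are typically used together with a default message. The following is a minimal sketch under stated assumptions: the function name get_opening_message, the symbolic identifiers MS_DEMO and MSG_OPENING, and the message text are all hypothetical. A real program would normally take the identifiers from a header generated from its message source file.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical symbolic identifiers for the set and message numbers. */
#define MS_DEMO      1
#define MSG_OPENING  1

/*
 * Copy the translated message into buf, falling back to the default
 * English text when the catalog or the message cannot be found.
 * get_opening_message is a hypothetical name used only for this sketch.
 */
void get_opening_message(const char *catalog, char *buf, size_t size)
{
    /* NL_CAT_LOCALE selects the catalog according to LC_MESSAGES. */
    nl_catd catd = catopen(catalog, NL_CAT_LOCALE);

    /* catgets returns the default string if the lookup fails. */
    const char *msg = catgets(catd, MS_DEMO, MSG_OPENING,
                              "demo: Opening the file.\n");

    /* Copy before closing: the pointer may not outlive the catalog. */
    snprintf(buf, size, "%s", msg);

    if (catd != (nl_catd)-1)
        catclose(catd);
}
```

The default message passed to catgets keeps the program usable when no catalog is installed for the current locale.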
The following are suggestions on
how to make messages meaningful and concise:
- Plan for the translation
of all messages, including messages that are displayed on panels.
- Externalize
messages.
- Provide default
messages.
- Make each message in a
message source file a complete entity. Building a message by
concatenating parts together makes translation difficult.
- Use the $len
directive in the message source file to control the maximum display length of
the message text. (The $len directive is specific to the
Message Facility.)
- Allow sufficient space
for translated messages to be displayed. Translated messages often
occupy more display columns than the original message text. In general,
allow about 20% to 30% more space for translated messages, but in some cases
you may need to allow 100% more space for translated messages.
- Use symbolic identifiers
to specify the set number and message number. Programs should refer to
set numbers and message numbers by their symbolic identifiers, not by their
actual numbers. (The use of symbolic identifiers is specific to the
Message Facility.)
- Facilitate the
reordering of sentence clauses by numbering the %s
variables. This allows the translator to reorder the clauses if
needed. For example, if a program needs to display the English
message: The file %s is referenced in
%s, a program may supply the two strings as follows:
printf(message_pointer, name1, name2);
The English message numbers the %s variables as
follows:
The file %1$s is referenced in %2$s\n
The translated equivalent of this message may be:
%2$s contains a reference to file %1$s\n
- Do not use
sys_errlist[errno] to obtain an error message. This defeats
the purpose of externalizing messages. The sys_errlist[] is
an array of error messages provided only in the English language. Use
strerror(errno), as it obtains messages from catalogs.
- Do not use
sys_siglist[signo] to obtain a signal message. This defeats the
purpose of externalizing messages. The sys_siglist[] is an
array of signal messages provided only in the English language. Use
psignal(), as it obtains messages from catalogs.
- Use the message comments
facility to aid in the maintenance and translation of messages.
- In general, create
separate message source files and catalogs for messages that apply to each
command or utility.
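The guideline above on numbering the %s variables can be illustrated with a short sketch. The function name format_reference and the file names in the usage note are hypothetical, and the example assumes a printf family that supports the X/Open numbered-argument conversions (%1$s, %2$s). In a real program the format string would come from the message catalog rather than from the source code.

```c
#include <stdio.h>

/*
 * Format a two-argument message whose format string may reorder its
 * arguments with numbered conversions. The program always passes
 * name1 and name2 in the same order; only the format string differs
 * between languages.
 * format_reference is a hypothetical name used only for this sketch.
 */
int format_reference(char *buf, size_t size, const char *fmt,
                     const char *name1, const char *name2)
{
    return snprintf(buf, size, fmt, name1, name2);
}
```

Calling format_reference with the format "The file %1$s is referenced in %2$s\n" and the names a.c and b.c yields "The file a.c is referenced in b.c", while the reordered format "%2$s contains a reference to file %1$s\n" yields "b.c contains a reference to file a.c" with no change to the calling code.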
- Show the command syntax in the
usage statement. For example, a possible usage statement for the
rm command is:
Usage: rm [-firRe] [--] File ...
- Capitalize the first letter of
such words as File, Directory, String, and
Number in usage statement messages.
- Do not abbreviate parameters
on the command line. For example, Num spelled out
as Number can be more easily translated.
- Use only the following
delimiters in usage statement messages:
[]      Encloses an optional parameter.
{}      Encloses multiple parameters, one of which is required.
|       Separates parameters that cannot both be chosen. For example,
        [a|b] indicates that you can choose a, b, or neither a nor b; and
        {a|b} indicates that you must choose a or b.
...     Follows a parameter that can be repeated on the command line. Note
        that there is a space before the ellipsis.
-       Indicates standard input.
- Do not use any delimiters for
a required parameter that is the only choice. For example:
banner String
- Put a space character between
flags that must be separated on the command line. For example:
unget [-n] [-rSID] [-s] {File|-}
- Do not separate flags that
can be used together on the command line. For example:
wc [-cwl] {File ...|-}
- Put flags in alphabetical
order when the order of the flags on the command line does not make a
difference. Put lowercase flags before uppercase flags. For
example:
get -aAijlmM
- Use your best judgment to
determine where you should end lines in the usage statement message.
The following example shows a lengthy usage statement message:
Usage: get [-e|-k] [-cCutoff] [-iList] [-rSID] [-wString] [-xList] [-b] [-gmnpst] [-l[p]] File ...
Clear writing aids in message
translation. The following guidelines on the writing style of messages
include terminology, punctuation, mood, voice, tense, capitalization, format,
and other usage questions.
- Write concise
messages. One-sentence messages are preferable.
- Use complete-sentence
format.
- Add articles (a, an, the)
when necessary to eliminate ambiguity.
- Capitalize the first word of
the sentence, and use a period at the end of the sentence.
- Use the present tense.
Do not use future tense in a message. For example, use the
sentence:
The cal command displays a calendar.
Instead of:
The cal command will display a calendar.
- Do not use the first person
(I or we) in messages.
- Avoid using the second person
(you) except in help and interactive text.
- Use active voice. The
following example shows how a message written in passive voice can be turned
into an active voice message.
Passive: Month and year must be entered as numbers.
Active: Enter month and year as numbers.
- Use the imperative mood
(command phrase) and active verbs such as specify, use, check, choose, and
wait.
- State messages in a
positive tone. The following example shows a negative message made more
positive.
Negative: Don't use the f option more than once.
Positive: Use the -f flag only once.
- Use words only in the
grammatical categories shown in a dictionary. If a word is shown only
as a noun, do not use it as a verb. For example, do not
solution a problem or architect a system.
- Do not use prefixes or
suffixes. Translators may not know what words beginning with
re-, un-, in-, or non- mean, and
the translations of messages that use prefixes or suffixes may not have the
meaning you intended. Exceptions to this rule occur when the prefix is
an integral part of a commonly used word. For example, the words
previous and premature are acceptable; the word
nonexistent is not acceptable.
- Do not use parentheses to
show singular or plural, as in error(s), which cannot be
translated. If you must show singular and plural, write error or
errors. You may also be able to revise the code so that different
messages are issued depending on whether the singular or plural of a word is
required.
- Do not use
contractions.
- Do not use quotation marks,
either single or double. For example, do not use
quotation marks around variables such as %s,
%c, and %d or around
commands. Users may interpret the quotation marks literally.
- Do not hyphenate words
at ends of lines.
- Do not use the standard
highlighting guidelines in messages, and do not substitute initial or all caps
for other highlighting practices. (Standard highlighting includes such
guidelines as bold for commands, subroutines, and files; italics for
variables and parameters; typewriter or courier for examples and
displayed text.)
- Do not use the and/or
construction. This construction does not exist in other
languages. Usually it is better to say or to indicate that
it is not necessary to do both.
- Use the 24-hour clock.
Do not use a.m. or p.m. to specify
time. For example, write 1:00 p.m.
as 1300.
- Avoid acronyms. Only
use acronyms that are better known to your audience than their spelled-out
version. To make a plural of an acronym, add a lowercase s without an
apostrophe. Verify that the acronym is not a trademark before using
it.
- Do not construct
messages from clauses. Use flags or other means within the program to
pass on information so that a complete message may be issued at the proper
time.
- Do not use hard-coded
text as a variable for a %s string in a message.
- End the last line of the
message with \n (indicating a new line). This
applies to one-line messages also.
- Begin the second and remaining
lines of a message with \t (indicating a tab).
- End all other lines with
\n\ (indicating a new line).
- Force a newline on word
boundaries where needed so that acceptable message strings display. The
printf subroutine, which often is used to display the message text,
disregards word boundaries and wraps text whenever necessary, sometimes
splitting a word in the middle.
- If, for some reason, the
message should not end with a newline character, leave writers a comment to
that effect.
- Precede each message with the
name of the command that called the message, followed by a colon. The
following example is a message containing a command name:
OPIE "foo: Opening the file."
- Tell the user to
Press the ------ key when a key on the keyboard must be pressed, and
name the specific key. For example:
Press the Ctrl-D key
- Do not tell the user to
Try again later, unless the system is overloaded. The need
to try again should be obvious from the message.
- Use the word "parameter"
to describe text on the command line, the word "value" to indicate numeric
data, and the words "command string" to describe the command with its
parameters.
- Do not use commas to set
off the thousands place in values. For example, use
1000 instead of 1,000.
- If a message must be set
off with an * (asterisk), use two asterisks at the beginning of the
message and two asterisks at the end of the message. For
example:
** Total **
- Use the words "log in"
and "log off" as verbs. For example:
Log in to the system; enter the data; then log off.
- Use the words "user
name," "group name," and "login" as nouns. For example:
The user name is sam.
The group name is staff.
The login directory is /u/sam.
- Do not use the word
"superuser." Note that the root user may not have all
privileges.
- Use the following
frequently occurring standard messages where applicable:
Preferred Standard Message                 Less Desirable Message
Cannot find or open the file.              Can't open filename.
Cannot find or access the file.            Can't access
The syntax of a parameter is not valid.    syntax error
Chapter 16, National Language Support.
Locale Overview for System
Management, How to Change the Language Environment,
and How to Change Your Locale in AIX 5L Version
5.1 System Management Guide: Operating System and
Devices.
Code Set Overview
National Language Support Overview for System
Management in AIX 5L Version 5.1 System Management
Guide: Operating System and Devices.
The chlang command, dspcat command, dspmsg command, gencat command, localedef command, lslpp command, mkcatdefs command, and
runcat command in AIX 5L Version 5.1 Commands
Reference.
Code Set Strategy in AIX 5L Version 5.1 Kernel Extensions and
Device Support Programming Concepts.
Character Set Description (charmap) source file
format, Locale Definition source file format in
AIX 5L Version 5.1 Files Reference.
The environment file in AIX 5L
Version 5.1 Files Reference.