Performance Management Guide
A multilingual application program can be needlessly slow if the
programmer is unaware of certain constraints on the design of multibyte character
sets. These constraints allow many programs to run efficiently in a multibyte
locale with little use of internationalization functions. For example:
- In all code sets supported by IBM, the character codes 0x00 through 0x3F
are unique and encode the ASCII standard characters. Being unique means
that these bit combinations never appear as one of the bytes of a multibyte
character. Because the null character is part of this set, the
strlen(), strcpy(), and strcat() functions
work on multibyte as well as single-byte strings. The programmer must
remember that the value returned by strlen() is the number of bytes
in the string, not the number of characters.
- Similarly, the standard string function call strchr(foostr, '/')
works correctly in all locales, because the / (slash) is
part of the unique code-point range. In fact, most of the standard
delimiters are in the 0x00 to 0x3F range, so most parsing can be accomplished
without recourse to internationalization functions or translation to
wchar_t form.
- Comparisons between strings fall into two classes: equal and
unequal. Use the standard strcmp() function to perform
equality comparisons. When we write
if (strcmp(foostr,"a rose") == 0)
we are not looking for "a rose" by any other name; we are
looking for that exact sequence of bytes only. If foostr contains
"a rosE", no match is found.
- Unequal (ordering) comparisons occur when we arrange strings in the
locale-defined collation sequence. In that case, we would use
if (strcoll(foostr,barstr) > 0)
and pay the performance cost of obtaining the collation information about
each character.
- When a program is executed, it always starts in the C locale. If it
will use one or more internationalization functions, including accessing
message catalogs, it must execute:
setlocale(LC_ALL, "");
to switch to the locale specified by its environment (normally inherited
from its parent process) before calling any internationalization function.
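The points above can be pulled together in a short C sketch. The helper names (init_locale, char_count, after_first_slash, same_bytes, sorts_after) are hypothetical and exist only for illustration. The sketch also assumes the POSIX behavior of mbstowcs(), where a null destination pointer returns the number of characters without storing anything.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: adopt the locale named by the environment.
 * Returns nonzero on success. Call once, before any locale-dependent
 * function such as mbstowcs() or strcoll(). */
int init_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Hypothetical helper: number of characters (not bytes) in s, decoded
 * according to the current locale. strlen(s) would give the byte count.
 * Assumes POSIX mbstowcs(): with a NULL destination it only counts.
 * Returns (size_t)-1 if s contains an invalid multibyte sequence. */
size_t char_count(const char *s)
{
    assert(s != NULL);
    return mbstowcs(NULL, s, 0);
}

/* Hypothetical helper: pointer to the text after the first '/', or NULL
 * if there is none. A plain byte search is enough because '/' lies in
 * the unique code-point range described above. */
const char *after_first_slash(const char *path)
{
    assert(path != NULL);
    const char *slash = strchr(path, '/');
    return slash ? slash + 1 : NULL;
}

/* Equality test: byte-for-byte match, no locale information needed. */
int same_bytes(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Ordering test: nonzero if a collates after b in the current locale.
 * This is the more expensive path, so use it only when sorting. */
int sorts_after(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcoll(a, b) > 0;
}
```

A program would typically call init_locale() once at startup, use strlen(), strchr(), and strcmp() wherever byte-level behavior is sufficient, and reserve char_count() and sorts_after() for the places where character counts or locale-defined ordering are actually required.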