Performance Management Guide
Some Simplifying Rules
A multilingual application program can be needlessly slow if the
programmer is unaware of certain constraints on the design of multibyte character
sets. These constraints allow many programs to run efficiently in a multibyte
locale with little use of internationalization functions. For example:
- In all code sets supported by IBM, the character codes 0x00 through 0x3F
are unique and encode the ASCII standard characters. Being unique means that
these bit combinations never appear as one of the bytes of a multibyte character.
Because the null character is part of this set, the strlen(), strcpy(), and strcat()
functions work on multibyte as well as single-byte strings. The programmer
must remember that the value returned by strlen() is
the number of bytes in the string, not the number of characters.
- Similarly, the standard string function call strchr(foostr, '/')
works correctly in all locales, because the / (slash) is part of
the unique code-point range. In fact, most of the standard delimiters are
in the 0x00 to 0x3F range, so most parsing can be accomplished without recourse
to internationalization functions or translation to wchar_t form.
- Comparisons between strings fall into two classes: equal and unequal.
Use the standard strcmp() function to perform equality comparisons.
When you write
if (strcmp(foostr,"a rose") == 0)
you are not looking for "a rose" by any other name; you are
looking for that exact sequence of bytes only. If foostr contains "a rosE",
no match is found.
- Unequal comparisons occur when you are attempting to arrange strings in
the locale-defined collation sequence, that is, when you need to know which
string sorts first. In that case, you would use
if (strcoll(foostr,barstr) > 0)
and pay the performance cost of obtaining the collation information
about each character.
- When a program is executed, it always starts in the C locale. If it will
use one or more internationalization functions, including accessing message
catalogs, it must execute:
setlocale(LC_ALL, "");
to switch to the locale named by its environment (normally inherited from
its parent process) before calling any internationalization function.
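
The rules above can be sketched in C as follows. This is only an illustrative sketch: the helper names (init_locale, byte_length, char_count, after_first_slash, same_bytes, sorts_after) are hypothetical and do not come from any library, and char_count() assumes the current locale can decode the string it is given. Only the byte-oriented helpers avoid the internationalization functions; char_count() and sorts_after() pay the locale-dependent cost described above.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Switch from the initial C locale to the one named by the environment.
   Returns 1 on success, 0 if the requested locale is not available. */
int init_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Number of BYTES in s. Safe on multibyte strings because the 0x00 byte
   never appears inside a multibyte character. */
size_t byte_length(const char *s)
{
    return strlen(s);
}

/* Number of CHARACTERS in s under the current locale, or (size_t)-1 if s
   contains a byte sequence that is invalid in that locale. */
size_t char_count(const char *s)
{
    size_t n = 0;

    (void)mblen(NULL, 0);               /* reset any shift state */
    while (*s != '\0') {
        int len = mblen(s, MB_CUR_MAX); /* bytes in the next character */
        if (len < 0)
            return (size_t)-1;
        s += len;
        n++;
    }
    return n;
}

/* Pointer to the text after the first '/', or NULL if there is none.
   The slash lies in the unique code-point range, so a plain byte search
   cannot match part of a multibyte character. */
const char *after_first_slash(const char *s)
{
    const char *p = strchr(s, '/');
    return p != NULL ? p + 1 : NULL;
}

/* Equality comparison: 1 only if both strings hold exactly the same bytes. */
int same_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Ordering comparison: 1 if a sorts after b in the current locale's
   collation sequence. This is the expensive, locale-dependent case. */
int sorts_after(const char *a, const char *b)
{
    return strcoll(a, b) > 0;
}
```

A program would typically call init_locale() once at startup, before char_count() or sorts_after(). The other helpers behave the same whether or not the locale has been set, which is why they are the cheaper choice wherever byte-level semantics are sufficient.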