General Programming Concepts: Writing and Debugging Programs
National Language Support (NLS) Quick Reference
The NLS Quick Reference provides a place to get started internationalizing programs. The following sections offer advice and a practical guide through the NLS documentation.
National Language Support Do's and Don'ts
The following list presents NLS guiding principles and advice intended to prevent common errors when internationalizing programs. See the "National Language Support Overview for Programming" for more information about NLS.
- DO externalize any user and error messages. We recommend the use of message catalogs. X applications may use resource files to externalize messages for each locale. See the "Message Facility Overview for Programming" for more information.
- DO use standard X/Open, ISO/ANSI C, and POSIX functions to maximize portability. See "NLS Subroutines Overview" for more information.
- DO use the font set specification in order to be code-set independent in X applications.
- DO use Xm (Motif) library widgets for building bidirectional and character-shaping applications. See "Layout (Bidirectional) Support in Xm (Motif) Library" in AIX Version 4.3 AIXwindows Programming Guide for general information. Refer to the XmText or XmTextField widgets for support of input and output of bidirectional and shaping characteristics.
- DON'T assume the size of all characters to be 8 bits, or 1 byte. Characters may be 1, 2, 3, 4 or more bytes. See "Multibyte Code and Wide Character Conversion Subroutines" and the "Code Set Overview" for more information.
- DON'T assume the encoding of any code set. See the "Code Set Overview" for more information.
- DON'T hard code names of code sets, locales, or fonts because it may impact portability. See "National Language Support Overview for Programming" for more information.
- DON'T use p++ to increment a pointer in a multibyte string. Use the mblen subroutine to determine the number of bytes that compose a character.
- DON'T assume any particular physical keyboard is in use. Use an input method based on the locale setting to handle keyboard input. See the "Input Method Overview" for more information.
- DON'T define your own converter unless absolutely necessary. See the "Converters Overview for Programming" for more information.
- DON'T assume that the char data type is either signed or unsigned. This is platform-specific. If the system defines char as signed, comparisons with a full 8-bit quantity yield incorrect results. Because all 8 bits are used to encode a character, declare such data as unsigned char wherever necessary. Note also that a signed char value used to index an array may yield incorrect results. To make programs portable, define 8-bit characters as unsigned char.
- DON'T use the layout subroutines in the libi18n.a library unless the application is doing presentation types of services. Most applications just deal with logically ordered text. See "Introducing Layout Library Subroutines" for more information.
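The advice above about not using p++ on a multibyte string can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name count_characters is hypothetical, and the caller is assumed to have already selected the locale with setlocale(LC_ALL, "").

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Count the characters (not the bytes) in a multibyte string by
 * advancing with mblen instead of p++.
 * Returns -1 if the string contains an invalid or incomplete character.
 * count_characters is a hypothetical helper name used for illustration.
 */
long count_characters(const char *s)
{
    assert(s != NULL);

    const char *p = s;
    size_t remaining = strlen(s);   /* bytes left to examine */
    long count = 0;

    (void)mblen(NULL, 0);           /* reset any shift state */
    while (remaining > 0) {
        int n = mblen(p, remaining);
        if (n < 0)
            return -1;              /* not a valid character in this locale */
        if (n == 0)
            break;                  /* reached the null character */
        p += n;                     /* step over one whole character */
        remaining -= (size_t)n;
        count++;
    }
    return count;
}
```

In a single-byte locale the result equals strlen(s). In a multibyte locale the result can be smaller, because one character may occupy several bytes.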
National Language Support Checklist
The National Language Support (NLS) Checklist provides a way to analyze a program for NLS dependencies. By going through this list, one can determine what, if any, NLS functions must be considered. This is useful for both programming and testing. If you identify a set of NLS items that a program depends on, a test strategy can be developed. This facilitates a common approach to testing all programs.
All major NLS considerations have been identified. However, this list is not all-encompassing. There may be other NLS questions that are not listed. See "National Language Support Overview for Programming" for more information about NLS.
See "National Language Support Do's and Don'ts" for a brief list of NLS advice.
Program Operation Checklist:
- Does the program display translatable messages to the user, either directly or indirectly? Examples of indirect messages are those stored in libraries.
If yes:
- Does the program compare text strings?
If yes:
- Are the strings compared to check equality only?
If yes:
- Are the strings compared to see which one sorts before the other, as defined in the current locale?
If yes:
- Does the program parse path names of files?
If yes:
- If looking for / (slash), use the strchr subroutine.
- If looking for characters, be aware that the file names can include multibyte characters. In such cases, invoke the setlocale subroutine in the following manner and then use appropriate search subroutines:
setlocale(LC_ALL, "")
- Does the program use system names, such as node names, user names, printer names, and queue names?
If yes:
- System names can have multibyte characters.
- To identify a multibyte character, first invoke the setlocale subroutine in the following manner and then use appropriate subroutines in the library.
setlocale(LC_ALL, "")
- Does the program use character class properties, such as uppercase, lowercase, and alphabetic?
If yes:
- Invoke the setlocale subroutine in the following manner:
setlocale(LC_ALL, "")
- Do not make assumptions about character properties. Always use system subroutines to determine character properties.
- Are the characters restricted to single-byte code sets?
If yes:
- Does the program convert the case (upper or lower) of characters?
If yes:
- Invoke the setlocale subroutine in the following manner:
setlocale(LC_ALL, "")
- Are the characters restricted to single-byte code sets?
If yes:
- Use these conv subroutines: _tolower, _toupper, tolower, or toupper.
If not, the characters may be multibyte characters:
- Does the program keep track of cursor movement on a tty terminal?
If yes:
- Invoke the setlocale subroutine in the following manner:
setlocale(LC_ALL, "")
- You may need to determine the display column width of characters. Use the wcwidth or wcswidth subroutine.
- Does the program perform character I/O?
If yes:
- Invoke the setlocale subroutine in the following manner:
setlocale(LC_ALL, "")
- Are the characters restricted to single-byte code sets?
If yes:
- Use following subroutine families:
If not:
- Use following subroutine families:
- Does the program step through an array of characters?
If yes:
- Is the array limited to single-byte characters only?
If yes:
- Does not require setlocale(LC_ALL, "")
- If p is the pointer to this array of single-byte characters, step through this array using p++.
If not:
- Does the program need to know the maximum number of bytes used to encode a character within the code set?
If yes:
- Does the program format date or time numeric quantities?
If yes:
- Does the program format numeric quantities?
If yes:
- Invoke the setlocale subroutine in the following manner:
setlocale(LC_ALL, "")
- Use the nl_langinfo or localeconv subroutine to obtain the locale-specific information.
- Use the following pair of subroutines, as needed: printf, scanf.
- Does the program format monetary quantities?
If yes:
- Does the program search for strings or locate characters?
If yes:
- Are you looking for single-byte characters in single-byte text?
- Does not require setlocale(LC_ALL, "")
- Use standard libc string subroutines such as the strchr subroutine.
- Are you looking for characters in the range 0x00-0x3F (the unique code-point range)?
- Are you looking for characters in the range 0x00-0xFF?
- Does the program perform regular-expression pattern matching?
If yes:
- Does the program ask the user for affirmative/negative responses?
If yes:
- Does the program use special box-drawing characters?
If yes:
- Do not use code set-specific box-drawing characters like those in IBM-850.
- Instead use the box-drawing characters and attributes specified in the terminfo file.
- Does the program perform culture-specific or locale-specific processing that is not addressed here?
If yes:
- Externalize the culture-specific modules. Do not make them part of the executable program.
- Load the modules at run time using subroutines provided by the system, such as the load subroutine.
- If the system does not provide such facilities, link them statically but provide them in a modular fashion.
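One common way to handle the two string-comparison cases in the checklist is to use strcmp when only equality matters and strcoll when the order defined by the current locale matters. The following is a minimal sketch under that assumption. The helper names are hypothetical, and the program is assumed to have called setlocale(LC_ALL, "") before sorting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Equality only: a byte-wise comparison is sufficient. */
int strings_are_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/*
 * qsort comparator that orders two strings by the collation rules
 * of the current locale (LC_COLLATE) rather than by byte value.
 */
static int compare_by_locale(const void *a, const void *b)
{
    const char *left = *(const char *const *)a;
    const char *right = *(const char *const *)b;
    return strcoll(left, right);
}

/* Sort an array of strings into the order defined by the current locale. */
void sort_strings_for_locale(const char **strings, size_t count)
{
    assert(strings != NULL);
    qsort(strings, count, sizeof strings[0], compare_by_locale);
}
```

In the C locale, strcoll orders strings the same way strcmp does. The two can differ once a language-specific locale is selected.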
AIXwindows CheckList
The remaining checklist items are specific to the AIXwindows systems.
- Does your client use labels, buttons, or other output-only widgets to display translatable messages?
If yes:
- Does your client use X resource files to define the text of labels, buttons, or text widgets?
If yes:
- Put all resources that need translation in one place.
- Consider using message catalogs for the text strings. See the "Message Facility Overview for Programming" for more information.
- Do not use translated color names, since color names are restricted to one encoding. The only portable names are encoded in the portable character set.
- Put language-specific resource files in /usr/lib/X11/%L/app-defaults/%N, where %L is the name of the locale, such as fr_FR, and %N is the name of the client.
- Is keyboard input localized by language?
If yes:
- Invoke the XtSetLanguageProc subroutine in the following manner:
XtSetLanguageProc(NULL, NULL, NULL);
- Use the XmText or XmTextField widgets for all text input.
Some of the XmText widgets' arguments are defined in terms of character length instead of byte length. The cursor position is maintained in character position, not byte position.
- Are you using the XmDrawingArea widget to do localized input?
- Use the input method subroutines to do input processing in different languages. See the "Input Method Overview" and the IMAuxDraw Callback subroutine for more information.
- Does your client present lists or labels consisting of localized text from user files rather than from X resource files?
If yes:
- Does your program do any presentation operations (Xlib drawing, printing, formatting, or editing) on bidirectional text?
If yes:
- Use the XmText or XmTextField in the Xm (Motif) library. These widgets are enabled for bidirectional text. See "Layout (Bidirectional) Support in Xm (Motif) Library" in AIX Version 4.3 AIXwindows Programming Guide for more information.
- If the Xm library cannot be used, use the layout subroutines to perform any reordering and shaping on the text. See "Introducing Layout Library Subroutines" for more information.
- Store and communicate the text in the implicit (logical) form. Some utilities (for example, aixterm) support the visual form of bidirectional text, but most NLS subroutines cannot process the visual form of bidirectional text.
If the response to all the above items is no, then the program probably has no NLS dependencies. In this case, you may not need the locale-setting subroutine setlocale and the catalog facility subroutines catopen and catgets.
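For programs that do have NLS dependencies, the setlocale, catopen, and catgets subroutines mentioned above are typically used together. The following is a minimal sketch, not a definitive implementation: the helper names are hypothetical, and the catalog name, set number, and message number are supplied by the caller.

```c
#include <assert.h>
#include <locale.h>
#include <nl_types.h>

/*
 * Select the user's locale and open a message catalog.
 * Returns (nl_catd)-1 if the catalog cannot be opened.
 * open_message_catalog is a hypothetical helper name.
 */
nl_catd open_message_catalog(const char *catalog_name)
{
    assert(catalog_name != NULL);
    setlocale(LC_ALL, "");
    return catopen(catalog_name, NL_CAT_LOCALE);
}

/*
 * Fetch a message from the catalog. If the catalog is not open or the
 * message is missing, the default text is returned instead, so the
 * program can always display something.
 */
const char *lookup_message(nl_catd catalog, int set_id, int msg_id,
                           const char *default_text)
{
    assert(default_text != NULL);
    if (catalog == (nl_catd)-1)
        return default_text;        /* catopen failed earlier */
    return catgets(catalog, set_id, msg_id, default_text);
}
```

A program would call open_message_catalog once at startup, call lookup_message for each message it displays, and release the catalog with catclose before exiting.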
Message Suggestions
The following are suggestions on how to make messages meaningful and concise:
- Plan for the translation of all messages, including messages that are displayed on panels.
- Externalize messages.
- Provide default messages.
- Make each message in a message source file a complete entity. Building a message by concatenating parts together makes translation difficult.
- Use the $len directive in the message source file to control the maximum display length of the message text. (The $len directive is specific to the Message Facility.)
- Allow sufficient space for translated messages to be displayed. Translated messages often occupy more display columns than the original message text. In general, allow about 20% to 30% more space for translated messages, but in some cases you may need to allow 100% more space for translated messages.
- Use symbolic identifiers to specify the set number and message number. Programs should refer to set numbers and message numbers by their symbolic identifiers, not by their actual numbers. (The use of symbolic identifiers is specific to the Message Facility.)
- Facilitate the reordering of sentence clauses by numbering the %s variables. This allows the translator to reorder the clauses if needed. For example, if a program needs to display the English message: The file %s is referenced in %s, a program may supply the two strings as follows:
printf(message_pointer, name1, name2)
The English message numbers the %s variables as follows:
The file %1$s is referenced in %2$s\n
The translated equivalent of this message may be:
%2$s contains a reference to file %1$s\n
- Do not use sys_errlist[errno] to obtain an error message. This defeats the purpose of externalizing messages. The sys_errlist[] is an array of error messages provided only in the English language. Use strerror(errno), as it obtains messages from catalogs.
- Do not use sys_siglist[signo] to obtain a signal message. This defeats the purpose of externalizing messages. The sys_siglist[] is an array of signal messages provided only in the English language. Use psignal(), as it obtains messages from catalogs.
- Use the message comments facility to aid in the maintenance and translation of messages.
- In general, create separate message source files and catalogs for messages that apply to each command or utility.
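The suggestion above about numbering the %s variables can be illustrated with a short sketch. It assumes a printf family that supports the X/Open %n$ positional notation. The function name format_reference_message is hypothetical, and in a real program the format string would come from a message catalog rather than from the caller.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a two-argument message into buffer. The format string decides
 * the order in which name1 and name2 appear, so a translator can reorder
 * the clauses without any change to the program.
 * Returns the value returned by snprintf.
 */
int format_reference_message(char *buffer, size_t size, const char *format,
                             const char *name1, const char *name2)
{
    assert(buffer != NULL && format != NULL);
    return snprintf(buffer, size, format, name1, name2);
}
```

Called with the format "The file %1$s is referenced in %2$s\n", the first name appears first. Called with "%2$s contains a reference to file %1$s\n" and the same arguments in the same order, the second name appears first.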
Describing Command Syntax in Messages
Writing Style of Messages
Clear writing aids in message translation. The following guidelines on the writing style of messages include terminology, punctuation, mood, voice, tense, capitalization, format, and other usage questions.
- Write concise messages. One-sentence messages are preferable.
- Use complete-sentence format.
- Add articles (a, an, the) when necessary to eliminate ambiguity.
- Capitalize the first word of the sentence, and use a period at the end of the sentence.
- Use the present tense. Do not use future tense in a message. For example, use the sentence:
The cal command displays a calendar.
Instead of:
The cal command will display a calendar.
- Do not use the first person (I or we) in messages.
- Avoid using the second person (you) except in help and interactive text.
- Use active voice. The following example shows how a message written in passive voice can be turned into an active voice message.
Passive: Month and year must be entered as numbers.
Active: Enter month and year as numbers.
- Use the imperative mood (command phrase) and active verbs such as specify, use, check, choose, and wait.
- State messages in a positive tone. The following example shows a negative message made more positive.
Negative: Don't use the f option more than once.
Positive: Use the -f flag only once.
- Use words only in the grammatical categories shown in a dictionary. If a word is shown only as a noun, do not use it as a verb. For example, do not solution a problem or architect a system.
- Do not use prefixes or suffixes. Translators may not know what words beginning with re-, un-, in-, or non- mean, and the translations of messages that use prefixes or suffixes may not have the meaning you intended. Exceptions to this rule occur when the prefix is an integral part of a commonly used word. For example, the words previous and premature are acceptable; the word nonexistent is not acceptable.
- Do not use parentheses to show singular or plural, as in error(s), which cannot be translated. If you must show singular and plural, write error or errors. You may also be able to revise the code so that different messages are issued depending on whether the singular or plural of a word is required.
- Do not use contractions.
- Do not use quotation marks, either single or double. For example, do not use quotation marks around variables such as %s, %c, and %d or around commands. Users may interpret the quotation marks literally.
- Do not hyphenate words at ends of lines.
- Do not use the standard highlighting guidelines in messages, and do not substitute initial or all caps for other highlighting practices. (Standard highlighting includes such guidelines as bold for commands, subroutines, and files; italics for variables and parameters; typewriter or courier for examples and displayed text.)
- Do not use the and/or construction. This construction does not exist in other languages. Usually it is better to say or to indicate that it is not necessary to do both.
- Use the 24-hour clock. Do not use a.m. or p.m. to specify time. For example, write 1:00 p.m. as 1300.
- Avoid acronyms. Only use acronyms that are better known to your audience than their spelled-out version. To make a plural of an acronym, add a lowercase s without an apostrophe. Verify that the acronym is not a trademark before using it.
- Do not construct messages from clauses. Use flags or other means within the program to pass on information so that a complete message may be issued at the proper time.
- Do not use hard-coded text as a variable for a %s string in a message.
- End the last line of the message with \n (indicating a new line). This applies to one-line messages also.
- Begin the second and remaining lines of a message with \t (indicating a tab).
- End all other lines with \n\ (indicating a new line).
- Force a newline on word boundaries where needed so that acceptable message strings display. The printf subroutine, which often is used to display the message text, disregards word boundaries and wraps text whenever necessary, sometimes splitting a word in the middle.
- If, for some reason, the message should not end with a newline character, leave writers a comment to that effect.
- Precede each message with the name of the command that called the message, followed by a colon. The following example is a message containing a command name:
OPIE "foo: Opening the file."
- Tell the user to Press the ------ key to select a key on the keyboard, including the specific key to press. For example:
Press the Ctrl-D key
- Do not tell the user to Try again later, unless the system is overloaded. The need to try again should be obvious from the message.
- Use the word "parameter" to describe text on the command line, the word "value" to indicate numeric data, and the words "command string" to describe the command with its parameters.
- Do not use commas to set off the thousands place in values. For example, use 1000 instead of 1,000.
- If a message must be set off with an * (asterisk), use two asterisks at the beginning of the message and two asterisks at the end of the message. For example:
** Total **
- Use the words "log in" and "log off" as verbs. For example:
Log in to the system; enter the data; then log off.
- Use the words "user name," "group name," and "login" as nouns. For example:
The user name is sam.
The group name is staff.
The login directory is /u/sam.
- Do not use the word "superuser." Note that the root user may not have all privileges.
- Use the following frequently occurring standard messages where applicable:
Preferred Standard Message | Less Desirable Message
Cannot find or open the file. | Can't open filename.
Cannot find or access the file. | Can't access
The syntax of a parameter is not valid. | syntax error
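The guideline above on singular and plural suggests revising the code so that a different complete message is issued for each case. A minimal sketch of that idea follows. The function names and message texts are hypothetical, and in a real program each text would be a separate entry in the message catalog. Some languages have more than two plural forms, so a two-way choice is only a starting point.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Choose a complete message for the singular or the plural case
 * instead of writing "error(s)" or assembling the sentence from parts.
 */
const char *error_count_format(long count)
{
    return (count == 1) ? "foo: %ld error was found.\n"
                        : "foo: %ld errors were found.\n";
}

/* Format the chosen message into buffer. Returns the snprintf result. */
int format_error_count(char *buffer, size_t size, long count)
{
    assert(buffer != NULL);
    return snprintf(buffer, size, error_count_format(count), count);
}
```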
Related Information
National Language Support Overview for Programming.
Locale Overview for System Management, How to Change the Language Environment, and How to Change Your Locale in AIX Version 4.3 System Management Guide: Operating System and Devices.
Code Set Overview, Code Set Strategy.
National Language Support Overview for System Management in AIX Version 4.3 System Management Guide: Operating System and Devices.
The chlang command, dspcat command, dspmsg command, gencat command, localedef command, lslpp command, mkcatdefs command, runcat command in AIX Version 4.3 Commands Reference.
Understanding Code Set Strategy in AIX Kernel Extensions and Device Support Programming Concepts.
Character Set Description (charmap) source file format, Locale Definition source file format in AIX Version 4.3 Files Reference.
The environment file in AIX Version 4.3 Files Reference.