Software localization do's and don'ts

June 21, 2023

Given the constant competitive pressure on software companies to shorten time to market, many developers work to tight deadlines to deliver working software. Often, this software is prepared for localization only once the source-language version is ready for release.

Given these pressures, developers need to ensure that basic internationalization principles are followed when developing software to facilitate seamless localization efforts and meet market requirements for all required languages, not just the source language.

Here are the dos and don’ts that all developers should know and apply in their work to take advantage of the fastest and most cost-effective multi-language software localization services:

1. Externalize messages into message catalogs, resource files, and configuration files: Messages are textual objects and are therefore translatable components. Place them in catalogs or files that are installed in a locale-specific location or named with a locale-specific suffix. This practice makes localization easier, because localizers can work on these resource packs without modifying the source code. It also makes it easier to maintain a single code base for all languages, with only the resource packs differing by language.
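As a minimal sketch of this idea, the following Python example keeps messages out of the program logic and looks them up by key. The catalog contents, the key names, and the get_message helper are hypothetical; in a real project each catalog would normally live in its own file with a locale-specific suffix rather than in a dictionary.

```python
# Hypothetical in-memory catalogs. In a real project each one would be a
# separate resource file named with a locale-specific suffix.
CATALOGS = {
    "en": {"greeting": "Welcome", "save_button": "Save"},
    "de": {"greeting": "Willkommen", "save_button": "Speichern"},
}
SOURCE_LOCALE = "en"


def get_message(key: str, locale: str) -> str:
    """Look up a message by key, falling back to the source language."""
    catalog = CATALOGS.get(locale, {})
    if key in catalog:
        return catalog[key]
    return CATALOGS[SOURCE_LOCALE][key]


# The calling code never contains the displayed text itself.
print(get_message("greeting", "de"))
print(get_message("greeting", "fr"))  # no French catalog, so the source text is used
```

Because the code refers only to keys, adding a language means adding a catalog, not editing the source.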

2. Do not internationalize fixed text objects: Fixed text objects such as comments, commands, and configuration settings should not be translated. Only externalize the strings that need translation.

If these objects appear in resource or configuration files, they should be marked with the “NOT_FOR_TRANSLATION” tag. Here are some examples of fixed text objects that do not need to be internationalized:

o Usernames, group names and passwords
o System or host names
o Names of terminals, printers and special devices
o Shell variables and environment variable names
o Message queues, semaphores, and shared memory labels
o UNIX commands and command line options
o Some GUI textual components, such as keyboard mnemonics and keyboard accelerators

3. Allow text expansion in messages (especially for GUI elements):

Apply the following expansion rules, based on the length of the source text:

o 0 – 10 characters: 101 – 200% expansion
o 11 – 20 characters: 81 – 100%
o 21 – 30 characters: 61 – 80%
o 31 – 50 characters: 41 – 60%
o 51 – 70 characters: 31 – 40%
o More than 70 characters: 30%

Also keep the length of each source string well below your string-length limit (usually 254 characters) to leave room for the extra characters that translations need.
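The expansion rules above can be turned into a small length check. The sketch below is one possible reading: it takes the upper bound of each range as the worst case and rounds up. The function names and the way the 254-character limit is applied are assumptions made for illustration.

```python
# Upper bound of each expansion range, keyed by the maximum source length
# to which it applies. Using the upper bound is an assumption (worst case).
EXPANSION_BANDS = [
    (10, 200),
    (20, 100),
    (30, 80),
    (50, 60),
    (70, 40),
]
DEFAULT_EXPANSION = 30  # more than 70 characters
STRING_LIMIT = 254      # the "usual" limit mentioned in the text


def expansion_percent(source_length: int) -> int:
    """Return the worst-case expansion percentage for a source length."""
    for max_length, percent in EXPANSION_BANDS:
        if source_length <= max_length:
            return percent
    return DEFAULT_EXPANSION


def length_budget(source_text: str) -> int:
    """Characters to reserve for a translation of source_text (rounded up)."""
    length = len(source_text)
    percent = expansion_percent(length)
    return (length * (100 + percent) + 99) // 100


def fits_limit(source_text: str, limit: int = STRING_LIMIT) -> bool:
    """True if the worst-case translation still fits within the limit."""
    return length_budget(source_text) <= limit


print(length_budget("Save"))  # a 4-character label needs room for 12
```

A check like this can run over a resource file to flag source strings whose translations are likely to overflow a field or a button.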

4. Do not use variables when you can avoid them: Variables raise doubts in the translator’s mind about the gender of the term to be substituted, which makes it difficult to correctly translate the sentences that incorporate it. If variables are to be used, always provide a list of replacements. Also allow gender and plural variations in the translation of sentences incorporating the variable.

Do not use compound strings. A compound string is an error message or other text that is dynamically generated from partial sentence segments and presented to the user as a complete sentence. Use complete sentences instead, even if you have to repeat segments. This will ensure the accuracy of the translation, regardless of gender, plurality, conjugation, or sentence structure. Also, when a string contains multiple variables, give each one its own distinct placeholder rather than reusing the same one, because sentence structure changes between languages and translators may need to reorder the variables.
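The following sketch illustrates complete sentences with distinct, named placeholders and separate plural variants. The message table, the "xx" locale (a stand-in for a translation whose word order differs from English), and the helper names are hypothetical. The plural rule shown covers only the English-style one/other split; other languages need their own rules.

```python
# Each entry is a complete sentence. Named placeholders let a translator
# move the variables around freely. "xx" is a made-up locale that stands
# in for a language with a different word order.
MESSAGES = {
    "en": {
        "one": "{user} deleted {count} file from {folder}.",
        "other": "{user} deleted {count} files from {folder}.",
    },
    "xx": {
        "one": "From {folder}, {count} file was deleted by {user}.",
        "other": "From {folder}, {count} files were deleted by {user}.",
    },
}


def plural_category(count: int) -> str:
    """English-style rule only; an assumption for this sketch."""
    return "one" if count == 1 else "other"


def format_deleted(locale: str, user: str, count: int, folder: str) -> str:
    """Pick the full sentence for the plural form, then fill in the values."""
    template = MESSAGES[locale][plural_category(count)]
    return template.format(user=user, count=count, folder=folder)


print(format_deleted("en", "Ana", 1, "Reports"))
print(format_deleted("xx", "Ana", 3, "Reports"))
```

Nothing in the calling code concatenates fragments, so the translator controls the whole sentence, including the position of every variable.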

5. Perform pseudo-translation: Pseudo-translation is the process of replacing characters in, or adding characters to, the strings in your software to detect character encoding issues and hard-coded text that remains in the source files.
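A pseudo-translation pass might look roughly like the sketch below. The accent mapping, the bracket markers, the 30% padding, and the curly-brace placeholder syntax are all assumptions chosen for illustration, not a standard.

```python
import re

# Swap plain vowels for accented ones so encoding problems become visible.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")
# Assumed placeholder syntax: {name}. Placeholders must survive unchanged.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def pseudo_translate(text: str, padding_percent: int = 30) -> str:
    """Accent the text, pad it to simulate expansion, and wrap it in brackets."""
    parts = PLACEHOLDER.split(text)
    # With a capturing group, odd-indexed parts are the placeholders.
    converted = [
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    ]
    padding = "~" * max(1, len(text) * padding_percent // 100)
    return "[" + "".join(converted) + padding + "]"


def looks_hard_coded(displayed: str) -> bool:
    """Text without the bracket markers never went through the catalog."""
    return not (displayed.startswith("[") and displayed.endswith("]"))


print(pseudo_translate("Save {name}"))
```

When the application runs against pseudo-translated resources, any string that appears on screen without brackets is likely hard-coded, and any string with a missing closing bracket is likely truncated.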

6. Do not use IF conditions or rely on a sort order in your code to evaluate a string value: For example, avoid (IF Gender = “Female” THEN). Once the string is translated, the comparison no longer matches and the sort order changes. Always use enums or unique IDs.
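A minimal sketch of the enum approach follows. The Gender enum, its stored codes, the label table, and the message keys are hypothetical names introduced for this example.

```python
from enum import Enum


class Gender(Enum):
    """Stable identifiers used by the logic; never shown to the user."""
    FEMALE = "F"
    MALE = "M"
    UNSPECIFIED = "U"


# Display labels live in the translatable resources, keyed by the enum.
LABELS = {
    "en": {Gender.FEMALE: "Female", Gender.MALE: "Male", Gender.UNSPECIFIED: "Unspecified"},
    "fr": {Gender.FEMALE: "Femme", Gender.MALE: "Homme", Gender.UNSPECIFIED: "Non spécifié"},
}


def label(gender: Gender, locale: str) -> str:
    """Return the localized label for display only."""
    return LABELS[locale][gender]


def salutation_key(gender: Gender) -> str:
    """Branch on the enum, not on a displayed (translatable) string."""
    if gender is Gender.FEMALE:
        return "salutation.female"
    if gender is Gender.MALE:
        return "salutation.male"
    return "salutation.neutral"


print(label(Gender.FEMALE, "fr"), salutation_key(Gender.FEMALE))
```

The branch in salutation_key behaves the same in every language, because it never inspects the text the user sees.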

7. Use Unicode functions and methods to support all scripts: Applications that store and retrieve text data must accept and display characters from any given language. Using Unicode solves the problem of unsupported character sets and prevents garbled characters from being displayed.
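To illustrate, the sketch below stores and reloads text in UTF-8 and shows how a legacy single-byte encoding fails for some scripts. The helper names and the sample strings are chosen for this example only.

```python
def store_text(text: str) -> bytes:
    """Encode text as UTF-8, which can represent every Unicode character."""
    return text.encode("utf-8")


def load_text(data: bytes) -> str:
    """Decode UTF-8 bytes back into text."""
    return data.decode("utf-8")


def fits_encoding(text: str, encoding: str) -> bool:
    """Report whether the text can be represented in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


SAMPLES = ["Grüße", "こんにちは", "مرحبا"]

for sample in SAMPLES:
    # UTF-8 round-trips every sample; Latin-1 handles only the first.
    print(load_text(store_text(sample)) == sample, fits_encoding(sample, "latin-1"))
```

A check like fits_encoding is mainly useful for auditing legacy storage or file formats before they receive translated text.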

8. Don’t insert hard carriage returns in the middle of sentences. Translation memory tools split the text at hard returns and assume that the sentence is complete. Inserting hard carriage returns in the middle of a sentence generates incomplete sentences in the translation database and corrupts the sentence structure in the target language files. Instead, replace hard returns with soft returns or use a break tag like [BR]. Also, sentence structures, as well as the length of parts of sentences, change in different languages. Therefore, additional breaks in target languages may be required.
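One way to apply the break-tag suggestion is to convert line breaks to the tag before strings go out for translation and to convert them back afterwards. The sketch below uses the [BR] tag mentioned above; the function names are hypothetical, and whether a given translation tool treats [BR] as an inline tag is an assumption to verify.

```python
BREAK_TAG = "[BR]"  # the example tag from the text


def protect_line_breaks(text: str) -> str:
    """Replace hard returns with a break tag so the string stays one segment."""
    return text.replace("\r\n", "\n").replace("\n", BREAK_TAG)


def restore_line_breaks(text: str) -> str:
    """Turn break tags back into line breaks after translation."""
    return text.replace(BREAK_TAG, "\n")


source = "The file could not be saved because\nthe disk is full."
protected = protect_line_breaks(source)
print(protected)
print(restore_line_breaks(protected) == source)
```

The translator can then move or add [BR] tags wherever the target language needs the break.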

9. Choose your third-party software provider carefully: Insist that your third-party software be Unicode-compliant and follow internationalization practices. If problems are found with third-party software and you don’t have control over its code to fix them, localization tasks become more difficult.

10. Do not use text in icons and bitmaps: The translated text may be too long to fit. Also, avoid using symbols with cultural connotations and locale-specific idioms.

11. Use long dates or abbreviated month names instead of numbers when identifying dates: The order of month and day varies in different parts of the world (e.g., US mm/dd/yy; Europe dd/mm/yy).
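The ambiguity is easy to demonstrate. In the sketch below, the same numeric string parses to two different dates depending on the assumed convention, while a long date with a spelled-out month cannot be misread. The pattern table and the format_long_date helper are hypothetical; month names are given in English only and would come from the localized resources in practice.

```python
from datetime import date, datetime

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Hypothetical per-locale patterns for a long date.
LONG_DATE_PATTERNS = {
    "en-US": "{month} {day}, {year}",
    "en-GB": "{day} {month} {year}",
}


def format_long_date(value: date, locale: str) -> str:
    """Format a date with the month spelled out, so day and month cannot be confused."""
    pattern = LONG_DATE_PATTERNS[locale]
    return pattern.format(
        month=MONTH_NAMES[value.month - 1], day=value.day, year=value.year
    )


ambiguous = "03/04/23"
as_us = datetime.strptime(ambiguous, "%m/%d/%y").date()        # March 4
as_european = datetime.strptime(ambiguous, "%d/%m/%y").date()  # April 3

print(format_long_date(as_us, "en-US"))
print(format_long_date(as_european, "en-GB"))
```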

12. Do not sort strings alphabetically in string tables and resource bundles: Alphabetical sorting separates strings from the related strings around them and removes context. Try to offer as much context as you can with externalized strings. This will help the translator adapt the translation to that context. Without context, it will take much longer to fix the translations during runtime QA.
