127 ft. deep with Globalization

Despite the spread of English through economic globalization, not all users of software speak English. Although English is widely used as a second language throughout the world, not all speakers can use it efficiently in their work, and not everyone wants to use English to accomplish their daily tasks; this is particularly true at the end-user level. In other words, national language identity is very much alive everywhere, for very practical reasons.

As Wikipedia says, Globalization is the process by which businesses or other organizations develop international influence or start operating on an international scale. For software, it means making applications work seamlessly, regardless of the user's language and culture.

Globalization starts with designing the application so that it can be used in non-English locales; this is called Internationalization. Features such as date and currency format choices, dynamic resizing of user interface elements, and the ability to input, view, and display data in different character sets, such as ideographic (Double-Byte Character Set) and simple text (Single-Byte Character Set), are all part of Internationalization. Complex bi-directional text languages such as Arabic and Hebrew must also be seamlessly supported. Legally mandated regional requirements must be factored in as well, or else the product cannot be sold in the target country. Internationalization, therefore, is the process of producing an application design and code base that are free of any dependency on the language and culture-specific attributes of the environment in which the application is used.
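Supporting bi-directional languages begins with knowing which way a piece of text runs. The snippet below is a minimal sketch of that idea, not a full implementation of the Unicode Bidirectional Algorithm: it looks for the first strongly directional character and uses it as the base direction. The function name `base_direction` and the "ltr" default are assumptions made for illustration.

```python
import unicodedata

# Bidirectional classes of "strong" characters:
# "L" is left-to-right; "R" and "AL" are right-to-left (Hebrew, Arabic, etc.).
_RTL_CLASSES = {"R", "AL"}
_LTR_CLASSES = {"L"}


def base_direction(text: str, default: str = "ltr") -> str:
    """Return "rtl" or "ltr" based on the first strong character in text.

    This is a simplified first-strong heuristic (hypothetical helper),
    not the complete Unicode Bidirectional Algorithm.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in _RTL_CLASSES:
            return "rtl"
        if bidi_class in _LTR_CLASSES:
            return "ltr"
    # No strong character found, e.g. digits or punctuation only.
    return default


if __name__ == "__main__":
    for sample in ("Hello", "שלום", "مرحبا", "123"):
        print(repr(sample), base_direction(sample))
```

A user interface could use such a value to decide whether a field or a whole layout needs mirroring, while leaving the actual text shaping to the platform.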

However, an internationalized application is not usable in any region of the world until it is localized for that specific region. It must speak the local language in every sense of the word. With a solid foundation of internationalization in place, it is relatively easy to localize the application into the desired language. Localization is the process of adapting an internationalized application to a specific language, script, culture, and coded character set environment. In localization, the semantics are preserved while the syntax may change. Localization goes beyond text translation; it must also cater to local conventions. For instance, one can select Arabic as the language, but also Egypt as the specific locale of Arabic. A locale allows for region-specific variations in formats, currency, spell-checking, punctuation, etc., all within the same language area.
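The Arabic and Egypt example can be sketched in code as a language part plus a region part, with regional values layered over language-wide ones. This is only an illustration: the `parse_locale` and `settings_for` helpers and both settings tables are hypothetical, and the parser handles only simple "language-REGION" tags (real locale identifiers can carry more parts, such as a script).

```python
from typing import Any, Dict, NamedTuple, Optional, Tuple


class Locale(NamedTuple):
    language: str          # e.g. "ar"
    region: Optional[str]  # e.g. "EG", or None when only a language is given


def parse_locale(tag: str) -> Locale:
    """Split a simple tag such as "ar-EG" or "ar_EG" into language and region."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return Locale(language, region)


# Hypothetical tables: values shared by the whole language area,
# then overrides that apply only to one region within it.
LANGUAGE_SETTINGS: Dict[str, Dict[str, Any]] = {
    "ar": {"direction": "rtl", "currency": None},
}
REGION_SETTINGS: Dict[Tuple[str, Optional[str]], Dict[str, Any]] = {
    ("ar", "EG"): {"currency": "EGP"},  # Egyptian pound
    ("ar", "SA"): {"currency": "SAR"},  # Saudi riyal
}


def settings_for(tag: str) -> Dict[str, Any]:
    """Merge language-wide settings with region-specific overrides."""
    locale = parse_locale(tag)
    merged = dict(LANGUAGE_SETTINGS.get(locale.language, {}))
    merged.update(REGION_SETTINGS.get((locale.language, locale.region), {}))
    return merged


if __name__ == "__main__":
    print(settings_for("ar-EG"))
    print(settings_for("ar"))
```

Keeping the language-wide values separate from the regional overrides means a new region of an already supported language needs only its differences to be defined.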

Putting Internationalization and Localization together, we get what we call Globalization. By convention, each of these words is abbreviated by counting the number of letters between its first and last letters, so Globalization = Internationalization + Localization is written as G11N = I18N + L10N. For all practical purposes, however, there is more to G11N than just I18N and L10N.
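The counting convention is easy to check with a few lines of code. The helper name `numeronym` below is just an illustrative choice.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + count of inner letters + last letter."""
    if len(word) < 3:
        # Nothing to count between the first and last letters.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


if __name__ == "__main__":
    for term in ("Globalization", "Internationalization", "Localization"):
        print(term, "->", numeronym(term).upper())
```

Running it on the three terms gives G11N, I18N, and L10N respectively.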

At a very high level, that is what globalization looks like. However, as you go deeper and start implementing things, there are hundreds of small and big decisions and activities that need to be taken care of. In this context, here is a list of 127 key considerations that can come in handy for any globalization project. These considerations are organized into the following hierarchy of groups.

1. Globalization

2.   Internationalization 

3.     Strings

4.       Unicode: Process everything as Unicode.

5.       Resource Files: Loading of multiple resource files should be available.

6.       String Comparison: Only strings in the same charset should be compared.

7.       Ordering and Sorting: Ordering and sorting need to follow charset-specific rules.

8.       Concatenation: Language-specific grammar rules are to be applied.

9.     Numbers 

10.       Numerals: Generally straightforward, but decimal places, number formats, etc. are some considerations.

11.       Currency: Currency conversions, using the rates as of a given date, might need to be processed.

12.       Units: Unit conversion might be needed if the same data is shown in different contexts.

13.     Date & Time

14.       Serialization: Serialization and de-serialization must round-trip the correct date and time.

15.       Timezone Awareness: Ensure timezone is considered in persistence and arithmetic.

16.       Arithmetic: Date calculations should be handled carefully.

17.     Culture

18.       Sensitivity: In some cases, culture aware operations need to be initiated explicitly.

19.       Insensitivity: Not everything is required to be culture sensitive.

20.     Input

21.       Fonts: The font in use must support the full set of characters used by the locale.

22.       Keyboards and IMEs: Correct Input Method Editor (IME) should be loaded.

23.       Keyboard Shortcuts: Shortcut keys must be locale centric.

24.     Output

25.       Mirroring: Ensure the correct directional output is configured.

26.       Media / Resource Path: Path strings on localized versions of OS may differ.

27.       Formatting: Locale specific formatting of information will need some attention.

28.     Persistence 

29.       Database: Decide to use same or multiple databases for various locales.

30.       File System: Locale-centric folder structure would be required.

31.       Cache: Multiple-locale aware cache operations can be tricky.

32.       Configuration Settings: Locale specific configuration settings need to be stored.

33.   Localization 

34.     Resources 

35.       Text

36.         Translation

37.           Phrasing

38.             Consistency: Consistency of translation across the application needs to be maintained.

39.             Vocabulary: Right vocabulary to be used as per every locale.

40.             Gist: Ensure that the crux of the message is maintained during translation.

41.           Symbols: Not all symbols may be available in all fonts, and some may have a different meaning in another language.

42.           Formatting: Placeholders need to be in the right place in the translated text, as per the grammar of the language.

43.           Typography: Not all font styles may be supported by all fonts. 

44.           Culture Sensitivity: Messages may need a change to maintain culture affinity.

45.         Transliteration

46.           Automated: To incorporate an automated transliteration system, if required.

47.           Manual: Or to provide manual side-by-side entry provision.

48.       Graphics

49.         Culture Sensitivity: Same graphic, icon, colors may not be applicable in all cultures.

50.       Media

51.         Locale Specific Versions: Whole set to be made available for every supported locale.

52.         Locale Neutral Version: Something that everyone understands and is done in culture neutral manner.

53.       Fonts

54.         Culture Aware Default: A different default font for each culture should be configured.

55.     Layout

56.       Length of localized text: English text often takes less space, while other languages may take more.

57.       Flexible placement: Translation may produce more or less text, so the UI needs to adjust dynamically.

58.       Font size: Not all languages look good at the same font size.

59.       Directional flow: Some languages go right to left; adjust user interface dynamically.

60.     Content 

61.       Documentation

62.         User Documentation: End users might need instructions in a different language than administrators.

63.         Online Content: Online links that the application may open need localization too, and the links need to be context sensitive so that the correct language version is opened.

64.         Help Files: Cross-referencing across help files of multiple languages might also be required for decent fall-backs.

65.       User Data: Locale-specific user data might need to be stored and processed, which calls for intelligent validation rules as per the grammar of the language; e.g., a 50-character limit in English might need to be adjusted to a more suitable number for an Arabic locale.

66.   More 

67.     Localizability Review: After localization is done, a thorough review of the algorithms, logic, user interface, validations, etc. needs to be done.

68.       User Interface

69.         Strings: Are the right string resources being loaded?

70.         Messages

71.           Information Stuffing: Do the messages make sense after data (numbers, text) is stuffed into them at run time?

72.           Concatenation: If more than one message is concatenated for some context, does the meaning come out right?

73.         System Dependent Nuances

74.           Dialog Boxes: Are the right language dialog boxes being loaded from system libraries?

75.           Error Messages: Are system error messages coming from the OS right for the locale?

76.           Paper Sizes: When the print dialog is opened, are all paper sizes that apply to the selected locale displayed?

77.           Folder/File Path Names: Are concatenated folder and file names that are generated by the application correct?

78.           File Extensions: Some custom file extensions, if used, need to be checked against various locales.

79.         Menus

80.           Shortcut Keys: Shortcut keys need to be adjusted as per translated text. E.g., (F)ile in English would be (D)atei in German. Shortcut keys will be different in these cases.

81.         Embedded Objects: Are embedded objects anywhere behaving correctly in various locales?

82.         Complex Text Nuances: Different languages and cultures have different formats and rules for all of the items below.

83.           Telephone Numbers

84.           Addresses

85.           Title Conventions

86.           Pluralization

87.           Punctuation

88.           Capitalization

89.       Executable Code

90.         Culture Sensitive Processing: Is the application performing culture-sensitive operations in the right places, e.g., sorting of user data?

91.         Culture Neutral Processing: Some operations must not be culture specific and have to stay neutral, e.g., the order in which files from a folder are processed (generally by date and not by name).

92.       Fallback

93.         Culture

94.           Locale / Region: Are the right locale-specific resources being loaded?

95.           Language: If locale resources are not available, is it falling back to the correct language resources?

96.         Font: Is font fallback working fine?

97.         Charset: Is the right charset fallback happening?

98.         Media, Resources, etc.: Are the defined fallback mechanisms working for media and other resource files?

99.       Accessibility Requirements: Check that the required accessibility requirements are fulfilled at run time.

100.       Automated Tests

101.         Deployment

102.           Locale Aware Resources: Is the build script bundling all required locale-specific resources?

103.           Locale Specific System Dependencies: During installation, are locale-specific system dependencies being installed from the OS?

104.         Text Handling

105.           Text Input: Are text input methods working correctly for various locales?

106.           Clipboard Operations: Are copy/paste operations working in non-English locales?

107.           Font Independence: Does the font have any major role to play in the application? Is fallback defined correctly?

108.           DBCS Encoding: Is double-byte character set text stored and retrieved correctly?

109.           Buffer Size: For large text concatenation operations, is buffer overflow occurring in locales that use more characters than English?

110.         Locale Aware Data Persistence: Is data persistence locale aware? Is the required locale context stored and retrieved correctly?

111.     Translation Services 

112.       Realtime: Decide whether a real-time translation service is to be used.

113.       Offline: Or is offline translation the way to go?

114.     Packaging 

115.       Internationalized Code: The code base has to be packaged separately from localized resources, so that deployments are smaller and only the required locales are installed.

116.       Localized Resources: All localized items, resources, media, etc. are to be packaged separately.

117.       Automated Build: Set up an automated build process that can package code and resources separately and provide a user interface to install them as per the user's choice.

118.     Localization Plan

119.       Global Defaults: Ensure global defaults for various locales are defined correctly as last fallback.

120.       Localized Defaults: Ensure defaults for various locales are defined as first selection.

121.       Localization Order: Define which locales you need to support and in what logical order they need to be processed. You may want to process similar locales together.

122.     Security Considerations 

123.       Internationalization

124.         Memory Buffers: A buffer overflow caused by longer text in some locale may terminate the program at an unknown location, leaking sensitive information.

125.       Localization

126.         Malicious String Resources: Some translated text may contain malicious code.

127.         String Delimiters: Incorrectly translated string delimiters may cause trouble when the text is processed.
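Several items in the list (8, 42, 71, and 72) come down to the same practice: do not build messages by concatenating fragments, because word order differs between languages. Below is a minimal sketch of the alternative, where each locale gets a complete sentence with named placeholders. The `MESSAGES` catalog, the `render` helper, and the German sentence are illustrative assumptions, and pluralization is deliberately left out.

```python
from typing import Any, Dict

# Hypothetical message catalog; the translations are illustrative only.
# Each entry is a complete sentence, so translators control the word order.
MESSAGES: Dict[str, Dict[str, str]] = {
    "en": {"files_deleted": "{user} deleted {count} files."},
    "de": {"files_deleted": "{count} Dateien wurden von {user} gelöscht."},
}

DEFAULT_LANGUAGE = "en"


def render(language: str, key: str, **values: Any) -> str:
    """Fill named placeholders in the template for the given language.

    Falls back to the default language when the language or key is missing.
    """
    catalog = MESSAGES.get(language, MESSAGES[DEFAULT_LANGUAGE])
    template = catalog.get(key, MESSAGES[DEFAULT_LANGUAGE][key])
    return template.format(**values)


if __name__ == "__main__":
    print(render("en", "files_deleted", user="Anna", count=3))
    print(render("de", "files_deleted", user="Anna", count=3))
```

Because the placeholders are named rather than positional, the German template can put the count before the user without any change to the calling code.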

Well written, specially useful is the flow charted way of representation ! A dumb man's two cents! Any opportunity for utilisation of Sanskrit in this context since there was talk of it being suitable to the "binary" world, bringing indianization in addition to localisation and globalisation since a vast amount of the bits and poetic software is generated in India or by Indians!! ? Pardon the curiosity if misplaced.

I'm curious why you list DBCS Encoding. Wouldn't a better solution be to support Unicode?

The whole course in a single article.. Very neatly done.

Vikas, this is super useful, for the way you explain how localization, globalization, and internationalization are different and are yet related to each other. As a content strategist, sometimes I refer to *The Language of Content Strategy* for how internationalization is important for global content strategy (http://www.thelanguageofcontentstrategy.com/content/terms-and-definitions), and your post is an excellent reference for product teams.
