Internationalization
Introduction
This chapter describes solutions to some common requirements when internationalizing C++ programs. Making software work in different locales (usually referred to as localization) requires solving two problems: formatting user-visible strings so that they obey local conventions (such as those for dates, times, money, and numbers), and reconciling data in different character sets. This chapter deals mostly with the first issue and only briefly with the second, because there is little standardized support for different character sets; most aspects of character-set handling are implementation dependent.
Most software will also run in countries other than the one where it was written. To support this practical reality, the C++ standard library has several facilities for writing code that will run in different countries. The design of these facilities, however, is different from that of many other standard library facilities such as strings, file input and output, containers, algorithms, and so forth. The class that is used to represent a locale is locale, and it is provided in the <locale> header. locale provides facilities for writing to and reading from streams using locale-specific formatting, and for getting information about a locale, such as the currency symbol or the date format. The standard only requires that a single locale be provided, though, and that is the "C" or classic locale. The classic locale uses ANSI C conventions: American English conventions and 7-bit ASCII character encoding. It is up to the implementation whether it will provide locale instances for the various languages and regions.
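As a minimal sketch of the last two points, the following helpers query the classic locale and probe whether the implementation supplies a locale with a given name. The helper names classicLocaleName and isLocaleAvailable are my own, chosen for illustration; the only locale name the standard guarantees to work is "C".

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Name of the classic locale; the standard specifies that this is "C".
std::string classicLocaleName() {
    return std::locale::classic().name();
}

// Hypothetical helper: returns true if the implementation can construct a
// locale with the given name. The locale constructor throws
// std::runtime_error when the name is not supported.
bool isLocaleAvailable(const char* name) {
    try {
        std::locale probe(name);
        (void)probe; // constructed only to see whether it succeeds
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Calling isLocaleAvailable with a platform-specific name such as "en_US.UTF-8" (an assumed example, common on POSIX systems) may return true on one machine and false on another, which is exactly the implementation dependence described above.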
There are three fundamental parts to the <locale> header. First, there is the locale class. It encapsulates all aspects of behavior for a locale that C++ supports, and it is your entry point to the different kinds of locale information you need to do locale-aware formatting. Second, the most granular parts of a locale, and the concrete classes you will be working with, are called facets. An example of a facet is a class such as time_put for writing a date to a stream. Third, each facet belongs to a category, which is a way of grouping related facets together. Examples of categories are numeric, time, and monetary (the time_put facet I mentioned a moment ago belongs to the time category). I mention categories briefly in this chapter, but they only really come in handy when you are doing more sophisticated things with locales, so I don't cover their use in depth here.
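To make the relationship between a locale and its facets concrete, here is a small sketch that fetches the time_put facet from a locale with use_facet and uses it to write a date into a string stream. The function name formatDate and the fixed "%Y-%m-%d" pattern are assumptions made for this illustration; a real program would more likely use a locale-dependent pattern such as "%x".

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: formats the date in 'when' as year-month-day using
// the time_put facet that belongs to 'loc'.
std::string formatDate(const std::tm& when, const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);

    // use_facet returns a reference to the facet owned by the locale.
    const std::time_put<char>& facet =
        std::use_facet<std::time_put<char> >(loc);

    const char pattern[] = "%Y-%m-%d";
    // put() writes through an output iterator; the stream supplies
    // formatting state and ' ' is the fill character.
    facet.put(std::ostreambuf_iterator<char>(out), out, ' ', &when,
              pattern, pattern + sizeof(pattern) - 1);
    return out.str();
}
```

Because the pattern here contains only numeric fields, the result is the same in any locale; the point of the sketch is the lookup path from locale to facet to formatted output.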
Every C++ program has at least one locale, referred to as the global locale (it is often implemented as a global static object). By default, it is the classic "C" locale unless you change it to something else. One of the locale constructors allows you to instantiate the user's preferred locale (you request it by passing an empty string as the locale name), although an implementation is free to define exactly what a user's "preferred" locale is.
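The following sketch shows how these pieces might fit together: the default locale constructor yields a copy of the current global locale, and constructing a locale from an empty string asks for the user's preferred locale. The helper names globalLocaleName and preferredOrClassic are hypothetical, and falling back to the classic locale on failure is my own design choice, not something the standard prescribes.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// The default constructor returns a copy of the current global locale.
std::string globalLocaleName() {
    return std::locale().name();
}

// Hypothetical helper: tries to construct the user's preferred locale and
// falls back to the classic locale if the environment names a locale the
// implementation cannot provide.
std::locale preferredOrClassic() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

A program would typically install the result once at startup with std::locale::global(preferredOrClassic()), which returns the previous global locale so that it can be restored later if needed.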
In most cases, you will only use locales when writing to or reading from streams. This is the main focus of this chapter.
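As a rough illustration of locale-aware stream I/O, the sketch below imbues string streams with a locale and then writes and reads a number. Since the classic locale is the only one guaranteed to exist, I use a small stand-in numpunct facet (DemoPunct, my own invention for this example) to imitate a locale that uses a comma as the decimal point and a period to separate thousands. All names in the block are hypothetical; on a system that provides a suitable named locale, you would imbue that locale instead.

```cpp
#include <iomanip>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Stand-in facet (an assumption for this sketch): comma as the decimal
// point, period as the thousands separator, digits grouped in threes.
class DemoPunct : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};

// Builds a locale that is the classic locale except for numeric punctuation.
// The locale takes ownership of the facet and deletes it when done.
std::locale makeDemoLocale() {
    return std::locale(std::locale::classic(), new DemoPunct);
}

// Writes 'value' with two decimal places using the conventions of 'loc'.
std::string writeAmount(double value, const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);
    out << std::fixed << std::setprecision(2) << value;
    return out.str();
}

// Reads a number from 'text' using the conventions of 'loc'.
// Returns 0.0 if the text cannot be parsed.
double readAmount(const std::string& text, const std::locale& loc) {
    std::istringstream in(text);
    in.imbue(loc);
    double value = 0.0;
    in >> value;
    return value;
}
```

With the stand-in locale, writeAmount(1234567.89, makeDemoLocale()) produces "1.234.567,89", while the same call with the classic locale produces "1234567.89". The stream code is identical in both cases; only the imbued locale changes.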