gawk: Locales

6.6 Where You Are Makes a Difference
====================================

Modern systems support the notion of "locales": a way to tell the system
about the local character set and language. The ISO C standard defines
a default '"C"' locale, which is an environment that is typical of what
many C programmers are used to.

Once upon a time, the locale setting affected regexp matching, but this
is no longer true (⇒Ranges and Locales).

Locales can affect record splitting. For the normal case of 'RS =
"\n"', the locale is largely irrelevant. For other single-character
record separators, setting 'LC_ALL=C' in the environment will give you
much better performance when reading records. Otherwise, 'gawk' has to
make several function calls, _per input character_, to find the record
terminator.

Locales can affect how dates and times are formatted (⇒Time
Functions). For example, a common way to abbreviate the date
September 4, 2015, in the United States is "9/4/15." In many countries
in Europe, however, it is abbreviated "4.9.15." Thus, the '%x'
specification in a '"US"' locale might produce '9/4/15', while in a
'"EUROPE"' locale, it might produce '4.9.15'.

According to POSIX, string comparison is also affected by locales
(similar to regular expressions). The details are presented in
⇒POSIX String Comparison.

Finally, the locale affects the value of the decimal point character
used when 'gawk' parses input data. This is discussed in detail in
⇒Conversion.