Info: (gawk) POSIX String Comparison

gawk: POSIX String Comparison

6.3.2.3 String Comparison Based on Locale Collating Order
.........................................................

The POSIX standard used to say that all string comparisons are performed
based on the locale's "collating order".  This is the order in which
characters sort, as defined by the locale (for more discussion,
⇒ Locales).  This order is usually very different from the results
obtained when doing straight byte-by-byte comparison.(1)

   Because this behavior differs considerably from existing practice,
'gawk' only implemented it when in POSIX mode (⇒ Options).  Here
is an example to illustrate the difference, in an 'en_US.UTF-8' locale:

     $ gawk 'BEGIN { printf("ABC < abc = %s\n",
     >                     ("ABC" < "abc" ? "TRUE" : "FALSE")) }'
     -| ABC < abc = TRUE
     $ gawk --posix 'BEGIN { printf("ABC < abc = %s\n",
     >                             ("ABC" < "abc" ? "TRUE" : "FALSE")) }'
     -| ABC < abc = FALSE

   Fortunately, as of August 2016, comparison based on locale collating
order is no longer required for the '==' and '!=' operators.(2)
However, comparison based on locales is still required for '<', '<=',
'>', and '>='.  POSIX thus recommends as follows:

     Since the '==' operator checks whether strings are identical, not
     whether they collate equally, applications needing to check whether
     strings collate equally can use:

          a <= b && a >= b

   As of version 4.2, 'gawk' continues to use locale collating order for
'<', '<=', '>', and '>=' only in POSIX mode.

   ---------- Footnotes ----------

   (1) Technically, string comparison is supposed to behave the same way
as if the strings were compared with the C 'strcoll()' function.

   (2) See the Austin Group website
(http://austingroupbugs.net/view.php?id=1070).