gawk: Uninitialized Subscripts

8.3 Using Uninitialized Variables as Subscripts
===============================================

Suppose it's necessary to write a program to print the input data in
reverse order. A reasonable attempt to do so (with some test data)
might look like this:

     $ echo 'line 1
     > line 2
     > line 3' | awk '{ l[lines] = $0; ++lines }
     > END {
     >     for (i = lines - 1; i >= 0; i--)
     >         print l[i]
     > }'
     -| line 3
     -| line 2

Unfortunately, the very first line of input data did not appear in
the output!

At first glance, this program looks as though it should have worked.
The variable 'lines' is uninitialized, and uninitialized variables have
the numeric value zero. So, 'awk' should have printed the value of
'l[0]'.

The issue here is that subscripts for 'awk' arrays are _always_
strings. Uninitialized variables, when used as strings, have the value
'""', not zero. Thus, 'line 1' ends up stored in 'l[""]', while the
loop in the 'END' rule looks for it in 'l["0"]'. The following version
of the program works correctly:

     { l[lines++] = $0 }
     END {
         for (i = lines - 1; i >= 0; i--)
             print l[i]
     }

Here, the '++' forces 'lines' to be numeric, thus making the "old
value" numeric zero. This is then converted to '"0"' as the array
subscript.

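To see where the first record went in the original (incorrect)
program, the array can be inspected in the 'END' rule. The following is
a minimal sketch and not part of the original example. The shell
variable 'out' and the messages printed are chosen here purely for
illustration.

```shell
# Run the original, incorrect program, but inspect the array
# instead of printing it in reverse order.
out=$(echo 'line 1
line 2
line 3' | awk '{ l[lines] = $0; ++lines }
END {
    # The first record was stored under the null-string subscript.
    if ("" in l)
        print "l[\"\"] = " l[""]
    # Nothing was ever stored under the subscript "0".
    if (!(0 in l))
        print "l[0] is absent"
}')
echo "$out"
```

The 'in' operator is used for both checks because it tests for a
subscript without creating the element, whereas merely referring to
'l[0]' would create an empty one.
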
Even though it is somewhat unusual, the null string ('""') is a valid
array subscript. (d.c.) 'gawk' warns about the use of the null string
as a subscript if '--lint' is provided on the command line (⇒ Options).

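As a small illustration that the null string behaves like any other
subscript, the sketch below stores an element under '""' and reads it
back. The array name 'seen' and the variables 'n', 'len', and 'out' are
hypothetical names used only for this example. The sketch does not
depend on 'gawk'; it should behave the same in any 'awk' where
subscripts are strings.

```shell
out=$(awk 'BEGIN {
    seen[""] = "stored under the null string"
    # The element exists and can be read back like any other.
    if ("" in seen)
        print "found: " seen[""]
    # Walk the subscripts, counting them and noting the length of each.
    for (k in seen) {
        n++
        len = length(k)
    }
    print n, "subscript(s), length " len
}')
echo "$out"
```

Running the same program with 'gawk --lint' should draw the warning
described above; its exact wording is not reproduced here.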