gawk: Arrays of Arrays

8.6 Arrays of Arrays
====================

'gawk' goes beyond standard 'awk''s multidimensional array access and
provides true arrays of arrays.  Elements of a subarray are referred to
by their own indices enclosed in square brackets, just like the elements
of the main array.  For example, the following creates a two-element
subarray at index '1' of the main array 'a':

     a[1][1] = 1
     a[1][2] = 2

   This simulates a true two-dimensional array.  Each subarray element
can contain another subarray as a value, which in turn can hold other
arrays as well.  In this way, you can create arrays of three or more
dimensions.  The indices can be any 'awk' expressions, including scalars
separated by commas (i.e., a regular 'awk' simulated multidimensional
subscript).  So the following is valid in 'gawk':

     a[1][3][1, "name"] = "barney"

   Each subarray and the main array can be of different lengths.  In
fact, the elements of an array or its subarray do not all have to have
the same type.  This means that the main array and any of its subarrays
can be nonrectangular, or jagged in structure.  You can assign a scalar
value to the index '4' of the main array 'a', even though 'a[1]' is
itself an array and not a scalar:

     a[4] = "An element in a jagged array"

   The terms "dimension", "row", and "column" are meaningless when
applied to such an array, but we will use "dimension" henceforth to
mean the maximum number of indices needed to refer to an existing
element.  The type of any element that has already been assigned cannot
be changed by assigning a value of a different type.  You have to first
delete the current element, which effectively makes 'gawk' forget about
the element at that index:

     delete a[4]
     a[4][5][6][7] = "An element in a four-dimensional array"

This removes the scalar value from index '4' and then inserts a
three-level nested subarray containing a scalar.  You can also delete an
entire subarray or subarray of subarrays:

     delete a[4][5]
     a[4][5] = "An element in subarray a[4]"

   But recall that you cannot delete the main array 'a' and then use it
as a scalar.

   The built-in functions that take array arguments can also be used
with subarrays.  For example, the following code fragment uses
'length()' (⇒String Functions) to determine the number of
elements in the main array 'a' and its subarrays:

     print length(a), length(a[1]), length(a[1][3])

This results in the following output for our main array 'a':

     -| 2 3 1

The 'SUBSCRIPT in ARRAY' expression (⇒Reference to Elements)
works similarly for both regular 'awk'-style arrays and arrays of
arrays.  For example, the tests '1 in a', '3 in a[1]', and '(1, "name")
in a[1][3]' all evaluate to one (true) for our array 'a'.

   The 'for (item in array)' statement (⇒Scanning an Array) can
be nested to scan all the elements of an array of arrays if it is
rectangular in structure.  In order to print the contents (scalar
values) of a two-dimensional array of arrays (i.e., in which each
first-level element is itself an array, not necessarily of the same
length), you could use the following code:

     for (i in array)
         for (j in array[i])
             print array[i][j]

   The 'isarray()' function (⇒Type Functions) lets you test if an
array element is itself an array:

     for (i in array) {
         if (isarray(array[i])) {
             for (j in array[i]) {
                 print array[i][j]
             }
         }
         else
             print array[i]
     }

   If the structure of a jagged array of arrays is known in advance, you
can often devise workarounds using control statements.  For example, the
following code prints the elements of our main array 'a':

     for (i in a) {
         for (j in a[i]) {
             if (j == 3) {
                 for (k in a[i][j])
                     print a[i][j][k]
             } else
                 print a[i][j]
         }
     }

⇒Walking Arrays for a user-defined function that "walks" an
arbitrarily dimensioned array of arrays.

   Recall that a reference to an uninitialized array element yields a
value of '""', the null string.  This has one important implication when
you intend to use a subarray as an argument to a function, as
illustrated by the following example:

     $ gawk 'BEGIN { split("a b c d", b[1]); print b[1][1] }'
     error-> gawk: cmd. line:1: fatal: split: second argument is not an array

   The way to work around this is to first force 'b[1]' to be an array
by creating an arbitrary index:

     $ gawk 'BEGIN { b[1][1] = ""; split("a b c d", b[1]); print b[1][1] }'
     -| a