gawk: Arrays of Arrays

8.6 Arrays of Arrays
====================

'gawk' goes beyond standard 'awk''s multidimensional array access and
provides true arrays of arrays. Elements of a subarray are referred to
by their own indices enclosed in square brackets, just like the elements
of the main array. For example, the following creates a two-element
subarray at index '1' of the main array 'a':

     a[1][1] = 1
     a[1][2] = 2

This simulates a true two-dimensional array. Each subarray element
can contain another subarray as a value, which in turn can hold other
arrays as well. In this way, you can create arrays of three or more
dimensions. The indices can be any 'awk' expressions, including scalars
separated by commas (i.e., a regular 'awk' simulated multidimensional
subscript). So the following is valid in 'gawk':

     a[1][3][1, "name"] = "barney"

Each subarray and the main array can be of different length. In
fact, the elements of an array or its subarray do not all have to have
the same type. This means that the main array and any of its subarrays
can be nonrectangular, or jagged in structure. You can assign a scalar
value to the index '4' of the main array 'a', even though 'a[1]' is
itself an array and not a scalar:

     a[4] = "An element in a jagged array"

The terms "dimension", "row", and "column" are meaningless when
applied to such an array, but we will use "dimension" henceforth to
imply the maximum number of indices needed to refer to an existing
element. The type of any element that has already been assigned cannot
be changed by assigning a value of a different type. You have to first
delete the current element, which effectively makes 'gawk' forget about
the element at that index:

     delete a[4]
     a[4][5][6][7] = "An element in a four-dimensional array"

This removes the scalar value from index '4' and then inserts a
three-level nested subarray containing a scalar. You can also delete an
entire subarray or subarray of subarrays:

     delete a[4][5]
     a[4][5] = "An element in subarray a[4]"

But recall that you cannot delete the main array 'a' and then use it
as a scalar.

The built-in functions that take array arguments can also be used
with subarrays. For example, the following code fragment uses
'length()' (⇒String Functions) to determine the number of
elements in the main array 'a' and its subarrays:

     print length(a), length(a[1]), length(a[1][3])

This results in the following output for our main array 'a' (the
values are separated by the default 'OFS', a single space):

     2 3 1

The 'SUBSCRIPT in ARRAY' expression (⇒Reference to Elements)
works similarly for both regular 'awk'-style arrays and arrays of
arrays. For example, the tests '1 in a', '3 in a[1]', and '(1, "name")
in a[1][3]' all evaluate to one (true) for our array 'a'.

The 'for (item in array)' statement (⇒Scanning an Array) can
be nested to scan all the elements of an array of arrays if it is
rectangular in structure. In order to print the contents (scalar
values) of a two-dimensional array of arrays (i.e., in which each
first-level element is itself an array, not necessarily of the same
length), you could use the following code:

     for (i in array)
         for (j in array[i])
             print array[i][j]

The 'isarray()' function (⇒Type Functions) lets you test if an
array element is itself an array:

     for (i in array) {
         if (isarray(array[i])) {
             for (j in array[i]) {
                 print array[i][j]
             }
         }
         else
             print array[i]
     }

If the structure of a jagged array of arrays is known in advance, you
can often devise workarounds using control statements. For example, the
following code prints the elements of our main array 'a':

     for (i in a) {
         for (j in a[i]) {
             if (j == 3) {
                 for (k in a[i][j])
                     print a[i][j][k]
             } else
                 print a[i][j]
         }
     }

See ⇒Walking Arrays for a user-defined function that "walks" an
arbitrarily dimensioned array of arrays.

Recall that a reference to an uninitialized array element yields a
value of '""', the null string. This has one important implication when
you intend to use a subarray as an argument to a function, as
illustrated by the following example:

     $ gawk 'BEGIN { split("a b c d", b[1]); print b[1][1] }'
     error-> gawk: cmd. line:1: fatal: split: second argument is not an array

The way to work around this is to first force 'b[1]' to be an array
by creating an arbitrary index:

     $ gawk 'BEGIN { b[1][1] = ""; split("a b c d", b[1]); print b[1][1] }'
     -| a
