gawk: Flattening Arrays

16.4.11.3 Working With All The Elements of an Array
...................................................

To "flatten" an array is to create a structure that represents the full
array in a fashion that makes it easy for C code to traverse the entire
array.  Some of the code in 'extension/testext.c' does this, and also
serves as a nice example showing how to use the APIs.

We walk through that part of the code one step at a time.  First, the
'gawk' script that drives the test extension:

     @load "testext"
     BEGIN {
         n = split("blacky rusty sophie raincloud lucky", pets)
         printf("pets has %d elements\n", length(pets))
         ret = dump_array_and_delete("pets", "3")
         printf("dump_array_and_delete(pets) returned %d\n", ret)
         if ("3" in pets)
             printf("dump_array_and_delete() did NOT remove index \"3\"!\n")
         else
             printf("dump_array_and_delete() did remove index \"3\"!\n")
         print ""
     }

This code creates an array with 'split()' (see String Functions) and
then calls 'dump_array_and_delete()'.  That function looks up the array
whose name is passed as the first argument, and deletes the element at
the index passed in the second argument.  The 'awk' code then prints the
return value and checks if the element was indeed deleted.  Here is the
C code that implements 'dump_array_and_delete()'.  It has been edited
slightly for presentation.

The first part declares variables, sets up the default return value
in 'result', and checks that the function was called with the correct
number of arguments:

     static awk_value_t *
     dump_array_and_delete(int nargs, awk_value_t *result)
     {
         awk_value_t value, value2, value3;
         awk_flat_array_t *flat_array;
         size_t count;
         char *name;
         int i;

         assert(result != NULL);
         make_number(0.0, result);

         if (nargs != 2) {
             printf("dump_array_and_delete: nargs not right "
                    "(%d should be 2)\n", nargs);
             goto out;
         }

The function then proceeds in steps, as follows.  First, retrieve the
name of the array, passed as the first argument, followed by the array
itself.  If either operation fails, print an error message and return:

         /* get argument named array as flat array and print it */
         if (get_argument(0, AWK_STRING, & value)) {
             name = value.str_value.str;
             if (sym_lookup(name, AWK_ARRAY, & value2))
                 printf("dump_array_and_delete: sym_lookup of %s passed\n",
                        name);
             else {
                 printf("dump_array_and_delete: sym_lookup of %s failed\n",
                        name);
                 goto out;
             }
         } else {
             printf("dump_array_and_delete: get_argument(0) failed\n");
             goto out;
         }

For testing purposes and to make sure that the C code sees the same
number of elements as the 'awk' code, the second step is to get the
count of elements in the array and print it:

         if (! get_element_count(value2.array_cookie, & count)) {
             printf("dump_array_and_delete: get_element_count failed\n");
             goto out;
         }

         printf("dump_array_and_delete: incoming size is %lu\n",
                (unsigned long) count);

The third step is to actually flatten the array, and then to
double-check that the count in the 'awk_flat_array_t' is the same as the
count just retrieved:

         if (! flatten_array_typed(value2.array_cookie, & flat_array,
                                   AWK_STRING, AWK_UNDEFINED)) {
             printf("dump_array_and_delete: could not flatten array\n");
             goto out;
         }

         if (flat_array->count != count) {
             printf("dump_array_and_delete: flat_array->count (%lu)"
                    " != count (%lu)\n",
                    (unsigned long) flat_array->count,
                    (unsigned long) count);
             goto out;
         }

The fourth step is to retrieve the index of the element to be
deleted, which was passed as the second argument.  Remember that
argument indices passed to 'get_argument()' are zero-based, and thus the
second argument is numbered one:

         if (! get_argument(1, AWK_STRING, & value3)) {
             printf("dump_array_and_delete: get_argument(1) failed\n");
             goto out;
         }

The fifth step is where the "real work" is done.  The function loops
over every element in the array, printing the index and element values.
In addition, upon finding the element with the index that is supposed to
be deleted, the function sets the 'AWK_ELEMENT_DELETE' bit in the
'flags' field of the element.  When the array is released, 'gawk'
traverses the flattened array, and deletes any elements that have this
flag bit set:

         for (i = 0; i < flat_array->count; i++) {
             printf("\t%s[\"%.*s\"] = %s\n",
                    name,
                    (int) flat_array->elements[i].index.str_value.len,
                    flat_array->elements[i].index.str_value.str,
                    valrep2str(& flat_array->elements[i].value));

             if (strcmp(value3.str_value.str,
                        flat_array->elements[i].index.str_value.str) == 0) {
                 flat_array->elements[i].flags |= AWK_ELEMENT_DELETE;
                 printf("dump_array_and_delete: marking element \"%s\" "
                        "for deletion\n",
                        flat_array->elements[i].index.str_value.str);
             }
         }

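The loop above prints each index using its stored length ('%.*s'), but
compares indices with 'strcmp()', which stops at the first NUL byte.  As
a possible alternative, the following sketch compares two
length-counted strings by length and then by bytes.  The type
'counted_str_t' and the function 'counted_str_equal()' are hypothetical
names invented for this illustration; they are not part of the 'gawk'
API and only mirror the 'str' and 'len' fields used above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted string (not a gawk type). */
typedef struct {
    const char *str;    /* the bytes; may contain embedded NUL bytes */
    size_t len;         /* number of bytes in str */
} counted_str_t;

/* Return 1 if a and b have the same length and the same bytes, else 0. */
static int
counted_str_equal(const counted_str_t *a, const counted_str_t *b)
{
    if (a->len != b->len)
        return 0;
    /* Two empty strings are equal; otherwise compare exactly len bytes. */
    return a->len == 0 || memcmp(a->str, b->str, a->len) == 0;
}
```

With such a helper, an index of '"3"' would not match a three-byte index
that merely begins with '"3"' followed by a NUL byte, whereas 'strcmp()'
treats the two as equal.
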
The sixth step is to release the flattened array.  This tells 'gawk'
that the extension is no longer using the array, and that it should
delete any elements marked for deletion.  'gawk' also frees any storage
that was allocated, so you should not use the pointer ('flat_array' in
this code) once you have called 'release_flattened_array()':

         if (! release_flattened_array(value2.array_cookie, flat_array)) {
             printf("dump_array_and_delete: could not release flattened array\n");
             goto out;
         }

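The mark-then-release idea from the last two steps can be modeled in a
few lines of standalone C.  This is only a toy sketch of deferred
deletion: the names 'toy_element_t', 'TOY_ELEMENT_DELETE', 'toy_mark()'
and 'toy_release()' are invented for illustration, and the code does not
reflect how 'gawk' stores or deletes array elements internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag, modeled on the idea of AWK_ELEMENT_DELETE. */
#define TOY_ELEMENT_DELETE 0x01u

/* Hypothetical flattened element: a key, a value, and a flags word. */
typedef struct {
    int key;
    int value;
    unsigned flags;
} toy_element_t;

/* Mark every element whose key matches; return how many were marked. */
static size_t
toy_mark(toy_element_t *elems, size_t count, int key)
{
    size_t i, marked = 0;

    for (i = 0; i < count; i++) {
        if (elems[i].key == key) {
            elems[i].flags |= TOY_ELEMENT_DELETE;
            marked++;
        }
    }
    return marked;
}

/*
 * "Release" the array: drop every marked element by compacting the
 * array in place, and return the number of elements that remain.
 */
static size_t
toy_release(toy_element_t *elems, size_t count)
{
    size_t i, kept = 0;

    for (i = 0; i < count; i++) {
        if ((elems[i].flags & TOY_ELEMENT_DELETE) == 0)
            elems[kept++] = elems[i];
    }
    return kept;
}
```

Separating the two phases means the traversal never modifies the array
while it is being walked; nothing is removed until the caller says it is
done.
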
Finally, because everything was successful, the function sets the
return value to success, and returns:

         make_number(1.0, result);
     out:
         return result;
     }

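The function as a whole follows a common C error-handling shape: set
the result to "failure" up front, jump to a single exit label whenever a
check fails, and overwrite the result with "success" only after the last
step.  The following standalone sketch shows that shape in isolation.
The function 'sum_first_two()' is a hypothetical example and has nothing
to do with the 'gawk' API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: store args[0] + args[1] in *sum.
 * Returns 1 on success and 0 on any failure.
 */
static int
sum_first_two(const int *args, size_t nargs, int *sum)
{
    int ok = 0;     /* default result: failure */

    if (args == NULL || sum == NULL)
        goto out;
    if (nargs != 2)
        goto out;

    *sum = args[0] + args[1];
    ok = 1;         /* reached only if every check passed */
out:
    return ok;
}
```

Because every failure path goes through the same label, any cleanup
that must always run can be placed once, right after 'out:'.
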
Here is the output from running this part of the test:

     pets has 5 elements
     dump_array_and_delete: sym_lookup of pets passed
     dump_array_and_delete: incoming size is 5
             pets["1"] = "blacky"
             pets["2"] = "rusty"
             pets["3"] = "sophie"
     dump_array_and_delete: marking element "3" for deletion
             pets["4"] = "raincloud"
             pets["5"] = "lucky"
     dump_array_and_delete(pets) returned 1
     dump_array_and_delete() did remove index "3"!