gawk: Flattening Arrays

16.4.11.3 Working With All The Elements of an Array
...................................................

To "flatten" an array is to create a structure that represents the full
array in a fashion that makes it easy for C code to traverse the entire
array.  Some of the code in 'extension/testext.c' does this, and also
serves as a nice example showing how to use the APIs.
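
   As a conceptual illustration only, the following minimal sketch shows
what "flattening" means in plain C: a linked structure is copied into one
contiguous block that a simple loop can index.  Every name in it
('struct node', 'struct flat', 'flatten()', 'flat_free()') is
hypothetical and is not part of the 'gawk' API; the real flattened type
is 'awk_flat_array_t', and its layout may differ.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an associative array: a linked list of pairs. */
struct node {
    const char *index;
    const char *value;
    struct node *next;
};

/* Hypothetical flattened form: a count plus one contiguous block. */
struct flat_elem {
    const char *index;
    const char *value;
};

struct flat {
    size_t count;
    struct flat_elem *elements;
};

/*
 * Copy every pair from the list into a contiguous array.
 * Returns NULL on allocation failure; release with flat_free().
 */
static struct flat *
flatten(const struct node *head)
{
    size_t n = 0, i = 0;
    const struct node *p;
    struct flat *f;

    for (p = head; p != NULL; p = p->next)
        n++;

    f = malloc(sizeof(*f));
    if (f == NULL)
        return NULL;

    f->count = n;
    f->elements = (n > 0) ? malloc(n * sizeof(*f->elements)) : NULL;
    if (n > 0 && f->elements == NULL) {
        free(f);
        return NULL;
    }

    for (p = head; p != NULL; p = p->next, i++) {
        f->elements[i].index = p->index;
        f->elements[i].value = p->value;
    }
    return f;
}

/* Free the flattened copy; the original list is left untouched. */
static void
flat_free(struct flat *f)
{
    if (f != NULL) {
        free(f->elements);
        free(f);
    }
}
```

   Once the data is in this form, visiting every element is a plain
'for' loop from zero up to 'count', which is the shape of the traversal
used by the extension code.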

   We walk through that part of the code one step at a time.  First, the
'gawk' script that drives the test extension:

     @load "testext"
     BEGIN {
         n = split("blacky rusty sophie raincloud lucky", pets)
         printf("pets has %d elements\n", length(pets))
         ret = dump_array_and_delete("pets", "3")
         printf("dump_array_and_delete(pets) returned %d\n", ret)
         if ("3" in pets)
             printf("dump_array_and_delete() did NOT remove index \"3\"!\n")
         else
             printf("dump_array_and_delete() did remove index \"3\"!\n")
         print ""
     }

This code creates an array with 'split()' (see String Functions) and
then calls 'dump_array_and_delete()'.  That function looks up the array
whose name is passed as the first argument, and deletes the element at
the index passed in the second argument.  The 'awk' code then prints the
return value and checks if the element was indeed deleted.  Here is the
C code that implements 'dump_array_and_delete()'.  It has been edited
slightly for presentation.

   The first part declares variables, sets up the default return value
in 'result', and checks that the function was called with the correct
number of arguments:

     static awk_value_t *
     dump_array_and_delete(int nargs, awk_value_t *result)
     {
         awk_value_t value, value2, value3;
         awk_flat_array_t *flat_array;
         size_t count;
         char *name;
         int i;

         assert(result != NULL);
         make_number(0.0, result);

         if (nargs != 2) {
             printf("dump_array_and_delete: nargs not right "
                    "(%d should be 2)\n", nargs);
             goto out;
         }
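
   The control flow used here (store a failure value first, jump to a
single exit label on any error, and overwrite the value only after all
the work has succeeded) can be isolated in a small standalone sketch.
'struct result' and 'check_args()' are hypothetical stand-ins; the
extension itself uses 'awk_value_t' and 'make_number()':

```c
#include <stdio.h>

/* Hypothetical stand-in for awk_value_t: it only holds a number. */
struct result {
    double num;
};

/*
 * Single-exit error handling: the result starts out as "failure" (0.0),
 * every error path jumps to 'out', and only the success path at the end
 * overwrites the result with 1.0.
 */
static struct result *
check_args(int nargs, struct result *res)
{
    res->num = 0.0;             /* default return value: failure */

    if (nargs != 2) {
        fprintf(stderr, "check_args: nargs not right (%d should be 2)\n",
                nargs);
        goto out;
    }

    /* ... the real work would go here, with more 'goto out' on errors ... */

    res->num = 1.0;             /* everything succeeded */
out:
    return res;
}
```

   With this arrangement each later step needs only a 'goto out' on
failure, and the caller always receives a well-defined value.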

   The function then proceeds in steps, as follows.  First, retrieve the
name of the array, passed as the first argument, followed by the array
itself.  If either operation fails, print an error message and return:

         /* get argument named array as flat array and print it */
         if (get_argument(0, AWK_STRING, & value)) {
             name = value.str_value.str;
             if (sym_lookup(name, AWK_ARRAY, & value2))
                 printf("dump_array_and_delete: sym_lookup of %s passed\n",
                        name);
             else {
                 printf("dump_array_and_delete: sym_lookup of %s failed\n",
                        name);
                 goto out;
             }
         } else {
             printf("dump_array_and_delete: get_argument(0) failed\n");
             goto out;
         }

   For testing purposes and to make sure that the C code sees the same
number of elements as the 'awk' code, the second step is to get the
count of elements in the array and print it:

         if (! get_element_count(value2.array_cookie, & count)) {
             printf("dump_array_and_delete: get_element_count failed\n");
             goto out;
         }

         printf("dump_array_and_delete: incoming size is %lu\n",
                (unsigned long) count);
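
   The cast to 'unsigned long' paired with '%lu' is the usual way to
print a 'size_t' with C libraries that predate C99.  C99 added the 'z'
length modifier, so '%zu' accepts a 'size_t' directly.  The sketch below
shows both forms side by side; the helper names are hypothetical, and it
writes into a buffer with 'snprintf()' only so that the two results can
be compared:

```c
#include <stddef.h>
#include <stdio.h>

/* Cast form: works with any printf that supports %lu. */
static int
format_count_cast(char *buf, size_t buflen, size_t count)
{
    return snprintf(buf, buflen, "incoming size is %lu",
                    (unsigned long) count);
}

/* C99 form: the 'z' length modifier matches size_t, so no cast is needed. */
static int
format_count_zu(char *buf, size_t buflen, size_t count)
{
    return snprintf(buf, buflen, "incoming size is %zu", count);
}
```

   For a count of 5, both helpers produce the text 'incoming size is 5'.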

   The third step is to actually flatten the array, and then to
double-check that the count in the 'awk_flat_array_t' is the same as the
count just retrieved:

         if (! flatten_array_typed(value2.array_cookie, & flat_array,
                                   AWK_STRING, AWK_UNDEFINED)) {
             printf("dump_array_and_delete: could not flatten array\n");
             goto out;
         }

         if (flat_array->count != count) {
             printf("dump_array_and_delete: flat_array->count (%lu)"
                    " != count (%lu)\n",
                    (unsigned long) flat_array->count,
                    (unsigned long) count);
             goto out;
         }

   The fourth step is to retrieve the index of the element to be
deleted, which was passed as the second argument.  Remember that
argument numbers passed to 'get_argument()' are zero-based, and thus the
second argument is numbered one:

         if (! get_argument(1, AWK_STRING, & value3)) {
             printf("dump_array_and_delete: get_argument(1) failed\n");
             goto out;
         }

   The fifth step is where the "real work" is done.  The function loops
over every element in the array, printing the index and element values.
In addition, upon finding the element with the index that is supposed to
be deleted, the function sets the 'AWK_ELEMENT_DELETE' bit in the
'flags' field of the element.  When the array is released, 'gawk'
traverses the flattened array, and deletes any elements that have this
flag bit set:

         for (i = 0; i < flat_array->count; i++) {
             printf("\t%s[\"%.*s\"] = %s\n",
                 name,
                 (int) flat_array->elements[i].index.str_value.len,
                 flat_array->elements[i].index.str_value.str,
                 valrep2str(& flat_array->elements[i].value));

             if (strcmp(value3.str_value.str,
                        flat_array->elements[i].index.str_value.str) == 0) {
                 flat_array->elements[i].flags |= AWK_ELEMENT_DELETE;
                 printf("dump_array_and_delete: marking element \"%s\" "
                        "for deletion\n",
                     flat_array->elements[i].index.str_value.str);
             }
         }
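
   The loop compares indices with 'strcmp()', which stops at the first
NUL byte, while the 'printf()' call uses the explicit 'len' field.  If
indices could ever contain embedded NUL bytes, a length-aware comparison
is one possible alternative.  The following standalone sketch shows that
idea together with the flag-marking step.  'struct str', 'struct elem',
'ELEM_DELETE', 'same_index()', and 'mark_for_deletion()' are all
hypothetical stand-ins, not 'gawk' API names:

```c
#include <stddef.h>
#include <string.h>

#define ELEM_DELETE 1u  /* hypothetical; stands in for AWK_ELEMENT_DELETE */

/* Hypothetical counted string: pointer plus explicit length. */
struct str {
    const char *str;
    size_t len;
};

/* Hypothetical flattened element: an index and a flags word. */
struct elem {
    struct str index;
    unsigned int flags;
};

/* Two indices match only if lengths and all bytes agree. */
static int
same_index(const struct str *a, const struct str *b)
{
    return a->len == b->len && memcmp(a->str, b->str, a->len) == 0;
}

/*
 * Set the delete bit on every element whose index equals 'target'.
 * Other flag bits are preserved because the bit is OR-ed in.
 * Returns the number of elements marked.
 */
static size_t
mark_for_deletion(struct elem *elems, size_t count, const struct str *target)
{
    size_t i, marked = 0;

    for (i = 0; i < count; i++) {
        if (same_index(&elems[i].index, target)) {
            elems[i].flags |= ELEM_DELETE;
            marked++;
        }
    }
    return marked;
}
```

   Marking changes only the flags word.  Nothing is removed during the
traversal itself, so the loop can safely continue over the remaining
elements.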

   The sixth step is to release the flattened array.  This tells 'gawk'
that the extension is no longer using the array, and that it should
delete any elements marked for deletion.  'gawk' also frees any storage
that was allocated, so you should not use the pointer ('flat_array' in
this code) once you have called 'release_flattened_array()':

         if (! release_flattened_array(value2.array_cookie, flat_array)) {
             printf("dump_array_and_delete: could not release flattened array\n");
             goto out;
         }
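
   As a conceptual model only (it does not describe how 'gawk' is
implemented), the sketch below mimics what a release step does: it
removes the marked entries from the underlying storage, frees the
flattened copy, and clears the caller's pointer so that it cannot be
reused by accident.  All names are hypothetical.  The real
'release_flattened_array()' receives the pointer by value, so an
extension that wants this protection would have to set its own pointer
to 'NULL' after the call:

```c
#include <stdlib.h>

#define ELEM_DELETE 1u  /* hypothetical; stands in for AWK_ELEMENT_DELETE */

/* Hypothetical flattened element: an integer key and a flags word. */
struct elem {
    int key;
    unsigned int flags;
};

/* Hypothetical flattened copy, allocated with malloc(). */
struct flat {
    size_t count;
    struct elem *elements;
};

/*
 * Drop from 'store' every key marked ELEM_DELETE in the flattened copy,
 * free the copy, and set the caller's pointer to NULL.
 * Returns the number of keys left in 'store'.
 */
static size_t
release_flat(int *store, size_t store_count, struct flat **flatp)
{
    struct flat *f = *flatp;
    size_t in, out = 0, i;

    for (in = 0; in < store_count; in++) {
        int doomed = 0;

        for (i = 0; i < f->count; i++) {
            if (f->elements[i].key == store[in]
                && (f->elements[i].flags & ELEM_DELETE) != 0)
                doomed = 1;
        }
        if (! doomed)
            store[out++] = store[in];   /* keep unmarked entries, in order */
    }

    free(f->elements);
    free(f);
    *flatp = NULL;      /* the old pointer value must not be used again */
    return out;
}
```

   Clearing the pointer turns an accidental later use into an immediate,
easily detected null-pointer error instead of a silent read of freed
storage.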

   Finally, because everything was successful, the function sets the
return value to success, and returns:

         make_number(1.0, result);
     out:
         return result;
     }

   Here is the output from running this part of the test:

     pets has 5 elements
     dump_array_and_delete: sym_lookup of pets passed
     dump_array_and_delete: incoming size is 5
             pets["1"] = "blacky"
             pets["2"] = "rusty"
             pets["3"] = "sophie"
     dump_array_and_delete: marking element "3" for deletion
             pets["4"] = "raincloud"
             pets["5"] = "lucky"
     dump_array_and_delete(pets) returned 1
     dump_array_and_delete() did remove index "3"!