gawk: Changing Fields

4.4 Changing the Contents of a Field
====================================

The contents of a field, as seen by 'awk', can be changed within an
'awk' program; this changes what 'awk' perceives as the current input
record.  (The actual input is untouched; 'awk' _never_ modifies the
input file.)  Consider the following example and its output:

     $ awk '{ nboxes = $3 ; $3 = $3 - 10
     >        print nboxes, $3 }' inventory-shipped
     -| 25 15
     -| 32 22
     -| 24 14
     ...

The program first saves the original value of field three in the
variable 'nboxes'.  The '-' sign represents subtraction, so this program
reassigns field three, '$3', as the original value of field three minus
ten: '$3 - 10'.  (⇒Arithmetic Ops.)  Then it prints the original
and new values for field three.  (Someone in the warehouse made a
consistent mistake while inventorying the red boxes.)

   For this to work, the text in '$3' must make sense as a number; the
string of characters must be converted to a number for the computer to
do arithmetic on it.  The number resulting from the subtraction is
converted back to a string of characters that then becomes field three.
⇒Conversion.
1 
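   The conversions described above can be observed with a made-up
one-line input (not taken from 'inventory-shipped').  In this sketch,
the text '007' is treated as the number 7, while 'none', which does not
look like a number, is treated as zero.  The shell variable 'converted'
is a name chosen only for this illustration.

```shell
# Hypothetical input: field 2 is numeric-looking text, field 3 is not.
converted=$(echo 'widgets 007 none' |
    awk '{ $2 = $2 - 2; $3 = $3 - 2; print $0 }')
echo "$converted"   # prints: widgets 5 -2
```
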
   When the value of a field is changed (as perceived by 'awk'), the
text of the input record is recalculated to contain the new field where
the old one was.  In other words, '$0' changes to reflect the altered
field.  Thus, this program prints a copy of the input file, with 10
subtracted from the second field of each line:

     $ awk '{ $2 = $2 - 10; print $0 }' inventory-shipped
     -| Jan 3 25 15 115
     -| Feb 5 32 24 226
     -| Mar 5 24 34 228
     ...

   It is also possible to assign contents to fields that are out of
range.  For example:

     $ awk '{ $6 = ($5 + $4 + $3 + $2)
     >        print $6 }' inventory-shipped
     -| 168
     -| 297
     -| 301
     ...

We've just created '$6', whose value is the sum of fields '$2', '$3',
'$4', and '$5'.  The '+' sign represents addition.  For the file
'inventory-shipped', '$6' represents the total number of parcels shipped
for a particular month.

   Creating a new field changes 'awk''s internal copy of the current
input record, which is the value of '$0'.  Thus, if you do 'print $0'
after adding a field, the record printed includes the new field, with
the appropriate number of field separators between it and the previously
existing fields.

   This recomputation affects and is affected by 'NF' (the number of
fields; ⇒Fields).  For example, the value of 'NF' is set to the
number of the highest field you create.  The exact format of '$0' is
also affected by a feature that has not been discussed yet: the "output
field separator", 'OFS', used to separate the fields (⇒Output
Separators).
1 
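   As a rough illustration of the last three paragraphs, the following
sketch creates a sixth field on a single made-up record and sets 'OFS'
to a hyphen so that the rebuilt separators are easy to see.  The input
line, the 'awk' variable 'before', and the shell variable 'summary' are
assumptions made for this example only.

```shell
# One made-up record with five fields, in the style of inventory-shipped.
summary=$(echo 'Jan 13 25 15 115' |
    awk 'BEGIN { OFS = "-" }
         { before = NF
           $6 = $2 + $3 + $4 + $5      # create field six
           print "NF went from " before " to " NF
           print $0 }')                # rebuilt record, joined with OFS
echo "$summary"
# prints:
#   NF went from 5 to 6
#   Jan-13-25-15-115-168
```
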
   Note, however, that merely _referencing_ an out-of-range field does
_not_ change the value of either '$0' or 'NF'.  Referencing an
out-of-range field only produces an empty string.  For example:

     if ($(NF+1) != "")
         print "can't happen"
     else
         print "everything is normal"

should print 'everything is normal', because 'NF+1' is certain to be out
of range.  (⇒If Statement for more information about 'awk''s
'if-else' statements.  ⇒Typing and Comparison for more
information about the '!=' operator.)
1 
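   The same point can be checked from the shell.  This sketch reads a
field well past the end of a made-up three-field record and then prints
the length of what it got, along with 'NF' and '$0'.  The names 'x' and
'probe' are illustrative only.

```shell
# Reading a nonexistent field yields "" and leaves NF and $0 alone.
probe=$(echo 'a b c' |
    awk '{ x = $(NF + 3)               # reference only, no assignment
           print length(x), NF, $0 }')
echo "$probe"   # prints: 0 3 a b c
```
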
   It is important to note that making an assignment to an existing
field changes the value of '$0' but does not change the value of 'NF',
even when you assign the empty string to a field.  For example:

     $ echo a b c d | awk '{ OFS = ":"; $2 = ""
     >                       print $0; print NF }'
     -| a::c:d
     -| 4

The field is still there; it just has an empty value, delimited by the
two colons between 'a' and 'c'.  This example shows what happens if you
create a new field:

     $ echo a b c d | awk '{ OFS = ":"; $2 = ""; $6 = "new"
     >                       print $0; print NF }'
     -| a::c:d::new
     -| 6

The intervening field, '$5', is created with an empty value (indicated
by the second pair of adjacent colons), and 'NF' is updated with the
value six.

   Decrementing 'NF' throws away the values of the fields after the new
value of 'NF' and recomputes '$0'.  (d.c.)  Here is an example:

     $ echo a b c d e f | awk '{ print "NF =", NF;
     >                           NF = 3; print $0 }'
     -| NF = 6
     -| a b c

     CAUTION: Some versions of 'awk' don't rebuild '$0' when 'NF' is
     decremented.
1 
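   A small sketch of the same behavior with a non-default 'OFS',
assuming an 'awk' that does rebuild '$0' when 'NF' is decremented (as
the caution above notes, not every version does).  The shell variable
'trimmed' is a hypothetical name used only here.

```shell
# Assumes an awk that rebuilds $0 when NF shrinks; see the caution above.
trimmed=$(echo 'a b c d e f' |
    awk 'BEGIN { OFS = "-" }
         { NF = 3                      # discard fields four through six
           print $0                    # record rebuilt with OFS
           print length($5) }')        # a discarded field now reads as ""
echo "$trimmed"
# prints:
#   a-b-c
#   0
```
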
   Finally, there are times when it is convenient to force 'awk' to
rebuild the entire record, using the current values of the fields and
'OFS'.  To do this, use the seemingly innocuous assignment:

     $1 = $1   # force record to be reconstituted
     print $0  # or whatever else with $0

This forces 'awk' to rebuild the record.  It does help to add a comment,
as we've shown here.
1 
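   One use of this idiom, sketched here with the default 'FS' and 'OFS'
and a made-up input line, is to squeeze runs of blanks and drop leading
and trailing whitespace.  The brackets in the output only make the
record boundaries visible; 'tidy' is an illustrative name.

```shell
# Made-up record with leading, trailing, and repeated blanks.
tidy=$(printf '  a   b  c  \n' |
    awk '{ $1 = $1                     # force record to be reconstituted
           print "[" $0 "]" }')
echo "$tidy"   # prints: [a b c]
```
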
   There is a flip side to the relationship between '$0' and the fields.
Any assignment to '$0' causes the record to be reparsed into fields
using the _current_ value of 'FS'.  This also applies to any built-in
function that updates '$0', such as 'sub()' and 'gsub()' (⇒String
Functions).
1 
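   Both cases can be sketched briefly with made-up input.  In the first
command, the record is initially split with the default 'FS', then
reparsed after 'FS' is changed and '$0' is assigned to itself.  In the
second, 'gsub()' edits '$0' and the fields are recomputed afterward.
The names 'n', 'resplit', and 'edited' are illustrative only.

```shell
# Assigning to $0 resplits it with the FS in effect at that moment.
resplit=$(echo 'a:b:c' |
    awk '{ n = NF                      # 1: default FS saw a single field
           FS = ":"
           $0 = $0                     # reparse using the new FS
           print n, NF, $2 }')
# gsub() with no third argument edits $0, which is then reparsed.
edited=$(echo 'a-b-c' |
    awk '{ gsub(/-/, " "); print NF, $2 }')
echo "$resplit"   # prints: 1 3 b
echo "$edited"    # prints: 3 b
```
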
                          Understanding '$0'

   It is important to remember that '$0' is the _full_ record, exactly
as it was read from the input.  This includes any leading or trailing
whitespace, and the exact whitespace (or other characters) that
separates the fields.

   It is a common error to try to change the field separators in a
record simply by setting 'FS' and 'OFS', and then expecting a plain
'print' or 'print $0' to print the modified record.

   But this does not work, because nothing was done to change the record
itself.  Instead, you must force the record to be rebuilt, typically
with a statement such as '$1 = $1', as described earlier.
1
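   The difference can be seen side by side in this sketch, which runs
the same made-up input through two programs that differ only in the
'$1 = $1' statement.  The shell variables 'unchanged' and 'rebuilt' are
names chosen for the illustration.

```shell
# Setting OFS alone does not touch $0 ...
unchanged=$(echo 'a b c' | awk 'BEGIN { OFS = "," } { print $0 }')
# ... but rebuilding the record applies it.
rebuilt=$(echo 'a b c' | awk 'BEGIN { OFS = "," } { $1 = $1; print $0 }')
echo "$unchanged"   # prints: a b c
echo "$rebuilt"     # prints: a,b,c
```
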