gawk: Changing Fields
1
1 4.4 Changing the Contents of a Field
1 ====================================
1
1 The contents of a field, as seen by 'awk', can be changed within an
1 'awk' program; this changes what 'awk' perceives as the current input
1 record. (The actual input is untouched; 'awk' _never_ modifies the
1 input file.) Consider the following example and its output:
1
1 $ awk '{ nboxes = $3 ; $3 = $3 - 10
1 > print nboxes, $3 }' inventory-shipped
1 -| 25 15
1 -| 32 22
1 -| 24 14
1 ...
1
1 The program first saves the original value of field three in the
1 variable 'nboxes'. The '-' sign represents subtraction, so this program
1 reassigns field three, '$3', as the original value of field three minus
1 ten: '$3 - 10'. (⇒Arithmetic Ops.) Then it prints the original
1 and new values for field three. (Someone in the warehouse made a
1 consistent mistake while inventorying the red boxes.)
1
1 For this to work, the text in '$3' must make sense as a number; the
1 string of characters must be converted to a number for the computer to
1 do arithmetic on it. The number resulting from the subtraction is
1 converted back to a string of characters that then becomes field three.
1 ⇒Conversion.
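
The conversion described above can be observed directly. The following is a minimal sketch, assuming a POSIX-compatible 'awk' is on the search path; the sample input and the shell variable name 'result' are made up for illustration and do not come from 'inventory-shipped'. It shows an integer string, a fractional string, and a non-numeric string each having ten subtracted:

```shell
# Each field is converted to a number, 10 is subtracted, and the
# result is converted back to a string that replaces the field.
# The text "abc" does not look like a number, so it is treated as 0.
result=$(printf '25 3.5 abc\n' | awk '{
    $1 = $1 - 10
    $2 = $2 - 10
    $3 = $3 - 10
    print $0
}')
echo "$result"
```

With a field such as 'abc', no error is reported; the arithmetic simply proceeds with zero, which is why the text in a field should make sense as a number before you compute with it.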
1
1 When the value of a field is changed (as perceived by 'awk'), the
1 text of the input record is recalculated to contain the new field where
1 the old one was. In other words, '$0' changes to reflect the altered
1 field. Thus, this program prints a copy of the input file, with 10
1 subtracted from the second field of each line:
1
1 $ awk '{ $2 = $2 - 10; print $0 }' inventory-shipped
1 -| Jan 3 25 15 115
1 -| Feb 5 32 24 226
1 -| Mar 5 24 34 228
1 ...
1
1 It is also possible to assign contents to fields that are out of
1 range. For example:
1
1 $ awk '{ $6 = ($5 + $4 + $3 + $2)
1 > print $6 }' inventory-shipped
1 -| 168
1 -| 297
1 -| 301
1 ...
1
1 We've just created '$6', whose value is the sum of fields '$2', '$3',
1 '$4', and '$5'. The '+' sign represents addition. For the file
1 'inventory-shipped', '$6' represents the total number of parcels shipped
1 for a particular month.
1
1 Creating a new field changes 'awk''s internal copy of the current
1 input record, which is the value of '$0'. Thus, if you do 'print $0'
1 after adding a field, the record printed includes the new field, with
1 the appropriate number of field separators between it and the previously
1 existing fields.
1
1 This recomputation affects and is affected by 'NF' (the number of
1 fields; ⇒Fields). For example, the value of 'NF' is set to the
1 number of the highest field you create. The exact format of '$0' is
1 also affected by a feature that has not been discussed yet: the "output
1 field separator", 'OFS', used to separate the fields (⇒Output
1 Separators).
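
As a small illustration of how 'NF' and 'OFS' take part in this recomputation, here is a sketch that assumes a POSIX-compatible 'awk'; the input line and the shell variable name 'result' are invented for the example. It assigns to a field two positions past the end of a three-field record:

```shell
# Assigning to $5 on a three-field record creates an empty $4 as well
# as $5. The rebuilt $0 joins all five fields with OFS, and NF is
# set to 5.
result=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } {
    $5 = "e"
    print $0
    print NF
}')
echo "$result"
```

The two adjacent hyphens in the first output line mark the empty fourth field.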
1
1 Note, however, that merely _referencing_ an out-of-range field does
1 _not_ change the value of either '$0' or 'NF'. Referencing an
1 out-of-range field only produces an empty string. For example:
1
1 if ($(NF+1) != "")
1 print "can't happen"
1 else
1 print "everything is normal"
1
1 should print 'everything is normal', because 'NF+1' is certain to be out
1 of range. (See ⇒If Statement for more information about 'awk''s
1 'if-else' statements, and ⇒Typing and Comparison for more
1 information about the '!=' operator.)
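
To see that a mere reference leaves the record alone, the following sketch (assuming a POSIX-compatible 'awk'; the input and the names 'before', 'value', and 'result' are illustrative only) records 'NF' before reading an out-of-range field and prints it again afterward:

```shell
# Reading $(NF + 2) yields an empty string (length 0) and does not
# create any fields, so NF and $0 are the same before and after.
result=$(printf 'a b c\n' | awk '{
    before = NF
    value = $(NF + 2)
    print before, NF, length(value)
    print $0
}')
echo "$result"
```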
1
1 It is important to note that making an assignment to an existing
1 field changes the value of '$0' but does not change the value of 'NF',
1 even when you assign the empty string to a field. For example:
1
1 $ echo a b c d | awk '{ OFS = ":"; $2 = ""
1 > print $0; print NF }'
1 -| a::c:d
1 -| 4
1
1 The field is still there; it just has an empty value, delimited by the
1 two colons between 'a' and 'c'. This example shows what happens if you
1 create a new field:
1
1 $ echo a b c d | awk '{ OFS = ":"; $2 = ""; $6 = "new"
1 > print $0; print NF }'
1 -| a::c:d::new
1 -| 6
1
1 The intervening field, '$5', is created with an empty value (indicated
1 by the second pair of adjacent colons), and 'NF' is updated with the
1 value six.
1
1 Decrementing 'NF' throws away the values of the fields after the new
1 value of 'NF' and recomputes '$0'. (d.c.) Here is an example:
1
1 $ echo a b c d e f | awk '{ print "NF =", NF;
1 > NF = 3; print $0 }'
1 -| NF = 6
1 -| a b c
1
1 CAUTION: Some versions of 'awk' don't rebuild '$0' when 'NF' is
1 decremented.
1
1 Finally, there are times when it is convenient to force 'awk' to
1 rebuild the entire record, using the current values of the fields and
1 'OFS'. To do this, use the seemingly innocuous assignment:
1
1 $1 = $1 # force record to be reconstituted
1 print $0 # or whatever else with $0
1
1 This forces 'awk' to rebuild the record. It does help to add a comment,
1 as we've shown here.
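
One visible effect of rebuilding the record can be sketched as follows, assuming a POSIX-compatible 'awk' with the default 'FS' and 'OFS'; the input and the shell variable name 'result' are made up for the example. The brackets are printed only to make the ends of the record visible:

```shell
# With the default FS, leading and trailing whitespace is not part of
# any field. Rebuilding the record joins the three fields with the
# default OFS (a single space), so the extra spacing disappears.
result=$(printf '  a   b    c  \n' | awk '{
    $1 = $1    # force record to be reconstituted
    print "[" $0 "]"
}')
echo "$result"
```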
1
1 There is a flip side to the relationship between '$0' and the fields.
1 Any assignment to '$0' causes the record to be reparsed into fields
1 using the _current_ value of 'FS'. This also applies to any built-in
1 function that updates '$0', such as 'sub()' and 'gsub()' (⇒String
1 Functions).
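
Both cases can be tried out with a short sketch. It assumes a POSIX-compatible 'awk'; the input lines and the shell variable names 'reparsed' and 'substituted' are hypothetical and chosen only for this illustration:

```shell
# Case 1: the record is first split on whitespace into one field.
# After FS is changed, assigning to $0 splits it again using the
# new FS, giving three fields.
reparsed=$(printf 'a:b:c\n' | awk '{ FS = ":"; $0 = $0; print NF, $2 }')
echo "$reparsed"

# Case 2: sub() with no third argument modifies $0. Replacing "b"
# with "x y" makes the record "a x y c", which is split into four
# fields.
substituted=$(printf 'a b c\n' | awk '{ sub(/b/, "x y"); print NF, $3 }')
echo "$substituted"
```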
1
1 Understanding '$0'
1
1 It is important to remember that '$0' is the _full_ record, exactly
1 as it was read from the input. This includes any leading or trailing
1 whitespace, and the exact whitespace (or other characters) that
1 separates the fields.
1
1 It is a common error to try to change the field separators in a
1 record simply by setting 'FS' and 'OFS', and then expecting a plain
1 'print' or 'print $0' to print the modified record.
1
1 But this does not work, because nothing was done to change the record
1 itself. Instead, you must force the record to be rebuilt, typically
1 with a statement such as '$1 = $1', as described earlier.
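
The difference can be sketched side by side, assuming a POSIX-compatible 'awk'; the input line and the shell variable names 'unchanged' and 'rebuilt' are invented for the example:

```shell
# Setting FS and OFS alone leaves $0 exactly as it was read.
unchanged=$(printf 'a:b:c\n' |
    awk 'BEGIN { FS = ":"; OFS = "-" } { print $0 }')
echo "$unchanged"

# Assigning to a field forces $0 to be rebuilt with OFS.
rebuilt=$(printf 'a:b:c\n' |
    awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print $0 }')
echo "$rebuilt"
```

The first command prints the record with its original colons; only the second prints it with hyphens.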
1