4.5 Specifying How Fields Are Separated
=======================================

· Default Field Splitting         How fields are normally separated.
· Regexp Field Splitting          Using regexps as the field separator.
· Single Character Fields         Making each character a separate field.
· Command Line Field Separator    Setting 'FS' from the command line.
· Full Line Fields                Making the full line be a single field.
· Field Splitting Summary         Some final points and a summary table.

The "field separator", which is either a single character or a regular
expression, controls the way 'awk' splits an input record into fields.
'awk' scans the input record for character sequences that match the
separator; the fields themselves are the text between the matches.

   In the examples that follow, we use the bullet symbol (*) to
represent spaces in the output.  If the field separator is 'oo', then
the following line:

     moo goo gai pan

is split into three fields: 'm', '*g', and '*gai*pan'.  Note the leading
spaces in the values of the second and third fields.

   The field separator is represented by the predefined variable 'FS'.
Shell programmers take note: 'awk' does _not_ use the name 'IFS' that is
used by the POSIX-compliant shells (such as the Unix Bourne shell, 'sh',
or Bash).

   The value of 'FS' can be changed in the 'awk' program with the
assignment operator, '=' (⇒ Assignment Ops).  Often, the right
time to do this is at the beginning of execution before any input has
been processed, so that the very first record is read with the proper
separator.  To do this, use the special 'BEGIN' pattern (⇒ BEGIN/END).
For example, here we set the value of 'FS' to the string
'","':

     awk 'BEGIN { FS = "," } ; { print $2 }'

Given the input line:

     John Q. Smith, 29 Oak St., Walamazoo, MI 42139

this 'awk' program extracts and prints the string '*29*Oak*St.'.

   Sometimes the input data contains separator characters that don't
separate fields the way you thought they would.
For instance, the
person's name in the example we just used might have a title or suffix
attached, such as:

     John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139

The same program would extract '*LXIX' instead of '*29*Oak*St.'.  If you
were expecting the program to print the address, you would be surprised.
The moral is to choose your data layout and separator characters
carefully to prevent such problems.  (If the data is not in a form that
is easy to process, perhaps you can massage it first with a separate
'awk' program.)
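
   The splitting behavior described in this section can be checked from
a POSIX shell.  The following is a minimal sketch rather than part of
the original examples: it wraps each field in square brackets instead of
using the bullet symbol, so that leading spaces stay visible.  The names
'split_oo', 'second_field', 'plain', and 'titled' are hypothetical
helpers introduced only for this illustration.

```shell
# Split "moo goo gai pan" on the separator "oo" and print every field
# in brackets, all on one line.
split_oo=$(printf 'moo goo gai pan\n' |
    awk 'BEGIN { FS = "oo" }
         { for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }')
echo "$split_oo"    # prints: [m][ g][ gai pan]

# Print the second comma-separated field of each input record.
# FS is set in BEGIN so that the first record is already split with it.
second_field() {
    awk 'BEGIN { FS = "," } { print "[" $2 "]" }'
}

plain=$(printf 'John Q. Smith, 29 Oak St., Walamazoo, MI 42139\n' |
    second_field)
titled=$(printf 'John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139\n' |
    second_field)

echo "$plain"       # prints: [ 29 Oak St.]
echo "$titled"      # prints: [ LXIX]
```

The last two commands run the same program on both address lines; the
extra comma after the name shifts every later field by one, which is why
the second run yields the suffix rather than the street address.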