gawk: Fixed width data

4.6.1 Processing Fixed-Width Data
---------------------------------

An example of fixed-width data would be the input for old Fortran
programs where numbers are run together, or the output of programs that
did not anticipate the use of their output as input for other programs.

   An example of the latter is a table where all the columns are lined
up by the use of a variable number of spaces and _empty fields are just
spaces_.  Clearly, 'awk''s normal field splitting based on 'FS' does not
work well in this case.  Although a portable 'awk' program can use a
series of 'substr()' calls on '$0' (⇒String Functions), this is
awkward and inefficient for a large number of fields.

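   As a rough sketch of the 'substr()' approach, the following shell
command slices one record of 'w'-style output by column position.  The
column offsets are assumptions chosen to match this one sample record;
nothing in 'awk' discovers them for you:

```shell
# One fixed-width record; the idle column (columns 26-31) is blank.
line='hzuo     ttyV0     8:58pm            9      5  vi p24.tex'

# Slice the record by position with substr(); the offsets are assumed
# from the sample layout, not discovered by awk.
out=$(printf '%s\n' "$line" | awk '{
    user  = substr($0, 1, 9)
    tty   = substr($0, 10, 6)
    login = substr($0, 16, 10)
    idle  = substr($0, 26, 6)
    printf "[%s][%s][%s][%s]\n", user, tty, login, idle
}')
printf '%s\n' "$out"
```

   The brackets make the padding visible: the empty idle field comes
back as a run of spaces.  Every additional field needs another
hard-coded offset and width, which is the bookkeeping that becomes
awkward as the number of fields grows.
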
   The splitting of an input record into fixed-width fields is specified
by assigning a string containing space-separated numbers to the built-in
variable 'FIELDWIDTHS'.  Each number specifies the width of the field,
_including_ columns between fields.  If you want to ignore the columns
between fields, you can specify the width as a separate field that is
subsequently ignored.  It is a fatal error to supply a field width that
has a negative value.

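   To illustrate ignoring the columns between fields, here is a minimal
sketch.  The record layout (three data fields separated by two-column
gaps) is hypothetical, made up for this example:

```shell
# Hypothetical record: a 2-column code, a 4-column number, and a
# 3-column tag, separated by 2-column gaps.
out=$(printf '%s\n' 'AB  1234  xyz' | gawk '
    # Widths in order: data, gap, data, gap, data.
    BEGIN { FIELDWIDTHS = "2 2 4 2 3" }
    # Fields 2 and 4 hold the gaps and are simply never used.
    { print $1, $3, $5 }')
printf '%s\n' "$out"
```

   With 'gawk', this should print 'AB 1234 xyz'.  The gaps still count
as fields, so the data fields are numbered 1, 3, and 5.
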
   The following data is the output of the Unix 'w' utility.  It serves
to illustrate the use of 'FIELDWIDTHS':

      10:06pm  up 21 days, 14:04,  23 users
     User     tty       login  idle   JCPU   PCPU  what
     hzuo     ttyV0     8:58pm            9      5  vi p24.tex
     hzang    ttyV3     6:37pm    50                -csh
     eklye    ttyV5     9:53pm            7      1  em thes.tex
     dportein ttyV6     8:17pm  1:47                -csh
     gierd    ttyD3    10:00pm     1                elm
     dave     ttyD4     9:47pm            4      4  w
     brent    ttyp0    26Jun91  4:46  26:46   4:41  bash
     dave     ttyq4    26Jun9115days     46     46  wnewmail

   The following program takes this input, converts the idle time to a
number of seconds, and prints out the first two fields and the
calculated idle time:

     BEGIN  { FIELDWIDTHS = "9 6 10 6 7 7 35" }
     NR > 2 {
         idle = $4
         sub(/^ +/, "", idle)   # strip leading spaces
         if (idle == "")
             idle = 0
         if (idle ~ /:/) {      # hh:mm
             split(idle, t, ":")
             idle = t[1] * 60 + t[2]
         }
         if (idle ~ /days/)
             idle *= 24 * 60 * 60

         print $1, $2, idle
     }

     NOTE: The preceding program uses a number of 'awk' features that
     haven't been introduced yet.

   Running the program on the data produces the following results:

     hzuo      ttyV0  0
     hzang     ttyV3  50
     eklye     ttyV5  0
     dportein  ttyV6  107
     gierd     ttyD3  1
     dave      ttyD4  0
     brent     ttyp0  286
     dave      ttyq4  1296000

   Another (possibly more practical) example of fixed-width input data
is the input from a deck of balloting cards.  In some parts of the
United States, voters mark their choices by punching holes in computer
cards.  These cards are then processed to count the votes for any
particular candidate or on any particular issue.  Because a voter may
choose not to vote on some issue, any column on the card may be empty.
An 'awk' program for processing such data could use the 'FIELDWIDTHS'
feature to simplify reading the data.  (Of course, getting 'gawk' to run
on a system with card readers is another story!)

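   As a purely illustrative sketch, suppose each card is one record with
one column per issue, where 'X' marks a vote and a space means the voter
skipped that issue.  This layout and the three sample cards are
assumptions made for the example, not a real ballot format:

```shell
# Three hypothetical cards, three issues each.
out=$(printf '%s\n' 'X X' ' XX' 'X  ' | gawk '
    BEGIN { FIELDWIDTHS = "1 1 1" }       # one column per issue
    {
        for (i = 1; i <= NF; i++)
            if ($i == "X")
                votes[i]++                # count marks per column
    }
    END {
        for (i = 1; i <= 3; i++)
            printf "issue %d: %d\n", i, votes[i]
    }')
printf '%s\n' "$out"
```

   Because each column is its own field, a skipped issue is just a field
holding a single space, and it never shifts the columns that follow it.
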