gawk: Fixed width data

4.6.1 Processing Fixed-Width Data
---------------------------------

An example of fixed-width data would be the input for old Fortran
programs where numbers are run together, or the output of programs that
did not anticipate the use of their output as input for other programs.

An example of the latter is a table where all the columns are lined
up by the use of a variable number of spaces and _empty fields are just
spaces_. Clearly, 'awk''s normal field splitting based on 'FS' does not
work well in this case. Although a portable 'awk' program can use a
series of 'substr()' calls on '$0' (see String Functions), this is
awkward and inefficient for a large number of fields.
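
For comparison, the portable approach might look like the following
sketch. The column positions (9, 6, and 10 columns for the first three
fields) are assumed from the layout of the 'w' output used later in this
section, and the function name 'split_with_substr' is purely
illustrative.

```sh
# Sketch: split one fixed-width record with substr() in portable awk.
# Assumed layout: user in columns 1-9, tty in 10-15, login in 16-25.
split_with_substr() {
    awk '{
        user  = substr($0, 1, 9)     # columns 1-9
        tty   = substr($0, 10, 6)    # columns 10-15
        login = substr($0, 16, 10)   # columns 16-25
        gsub(/^ +| +$/, "", user)    # trim the padding from each piece
        gsub(/^ +| +$/, "", tty)
        gsub(/^ +| +$/, "", login)
        print user "|" tty "|" login
    }'
}

line='hzuo     ttyV0     8:58pm            9      5  vi p24.tex'
out=$(printf '%s\n' "$line" | split_with_substr)
echo "$out"
```

Every additional field needs its own 'substr()' call with a hand-computed
starting column, which is the bookkeeping that becomes tedious as the
number of fields grows.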

The splitting of an input record into fixed-width fields is specified
by assigning a string containing space-separated numbers to the built-in
variable 'FIELDWIDTHS'. Each number specifies the width of the field,
_including_ columns between fields. If you want to ignore the columns
between fields, you can specify the width as a separate field that is
subsequently ignored. It is a fatal error to supply a field width that
has a negative value.

The following data is the output of the Unix 'w' utility. It serves
to illustrate the use of 'FIELDWIDTHS':

      10:06pm  up 21 days, 14:04,  23 users
     User     tty       login  idle   JCPU   PCPU  what
     hzuo     ttyV0     8:58pm            9      5  vi p24.tex
     hzang    ttyV3     6:37pm    50                -csh
     eklye    ttyV5     9:53pm            7      1  em thes.tex
     dportein ttyV6     8:17pm  1:47                -csh
     gierd    ttyD3    10:00pm     1                elm
     dave     ttyD4     9:47pm            4      4  w
     brent    ttyp0    26Jun91  4:46  26:46   4:41  bash
     dave     ttyq4    26Jun9115days     46     46  wnewmail

The following program takes this input, converts the idle time to
number of seconds, and prints out the first two fields and the
calculated idle time:

     BEGIN  { FIELDWIDTHS = "9 6 10 6 7 7 35" }
     NR > 2 {
         idle = $4
         sub(/^ +/, "", idle)   # strip leading spaces
         if (idle == "")
             idle = 0
         if (idle ~ /:/) {      # hh:mm
             split(idle, t, ":")
             idle = t[1] * 60 + t[2]
         }
         if (idle ~ /days/)
             idle *= 24 * 60 * 60

         print $1, $2, idle
     }

     NOTE: The preceding program uses a number of 'awk' features that
     haven't been introduced yet.

Running the program on the data produces the following results:

     hzuo      ttyV0  0
     hzang     ttyV3  50
     eklye     ttyV5  0
     dportein  ttyV6  107
     gierd     ttyD3  1
     dave      ttyD4  0
     brent     ttyp0  286
     dave      ttyq4  1296000

Another (possibly more practical) example of fixed-width input data
is the input from a deck of balloting cards. In some parts of the
United States, voters mark their choices by punching holes in computer
cards. These cards are then processed to count the votes for any
particular candidate or on any particular issue. Because a voter may
choose not to vote on some issue, any column on the card may be empty.
An 'awk' program for processing such data could use the 'FIELDWIDTHS'
feature to simplify reading the data. (Of course, getting 'gawk' to run
on a system with card readers is another story!)