gawk: Finding The Bug

1 
1 14.2.2 Finding the Bug
1 ----------------------
1 
1 Let's say that we are having a problem using (a faulty version of)
1 'uniq.awk' in "field-skipping" mode: it doesn't seem to be catching
1 lines that should compare as identical when the first field is skipped,
1 such as:
1 
1      awk is a wonderful program!
1      gawk is a wonderful program!
1 
1    This could happen if we were thinking, C-style, of the fields in a
1 record as being numbered from zero, so that instead of the lines:
1 
1      clast = join(alast, fcount+1, n)
1      cline = join(aline, fcount+1, m)
1 
1 we wrote:
1 
1      clast = join(alast, fcount, n)
1      cline = join(aline, fcount, m)
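The one-based numbering is easy to confirm from the shell.  The following is a minimal sketch, assuming a POSIX 'awk' is on the search path; it uses one of the sample lines from above, and the shell variable name 'fields' is chosen only for illustration:

```shell
# Print the first field, the second field, and the field count of a record.
# awk numbers fields from 1; $0 is the whole record, not the first field.
fields=$(printf '%s\n' 'gawk is a wonderful program!' |
    awk '{ print $1; print $2; print NF }')
printf '%s\n' "$fields"
```

This prints 'gawk', 'is', and '5' on separate lines, so a "skip one field" request has to start joining at field 2, not field 1.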
1 
1    The first thing we usually want to do when trying to investigate a
1 problem like this is to put a breakpoint in the program so that we can
1 watch it at work and catch what it is doing wrong.  A reasonable spot
1 for a breakpoint in 'uniq.awk' is at the beginning of the function
1 'are_equal()', which compares the current line with the previous one.
1 To set the breakpoint, use the 'b' (breakpoint) command:
1 
1      gawk> b are_equal
1      -| Breakpoint 1 set at file `awklib/eg/prog/uniq.awk', line 63
1 
1    The debugger tells us the file and line number where the breakpoint
1 is.  Now type 'r' or 'run' and the program runs until it hits the
1 breakpoint for the first time:
1 
1      gawk> r
1      -| Starting program:
1      -| Stopping in Rule ...
1      -| Breakpoint 1, are_equal(n, m, clast, cline, alast, aline)
1               at `awklib/eg/prog/uniq.awk':63
1      -| 63          if (fcount == 0 && charcount == 0)
1      gawk>
1 
1    Now we can look at what's going on inside our program.  First of all,
1 let's see how we got to where we are.  At the prompt, we type 'bt'
1 (short for "backtrace"), and the debugger responds with a listing of the
1 current stack frames:
1 
1      gawk> bt
1      -| #0  are_equal(n, m, clast, cline, alast, aline)
1               at `awklib/eg/prog/uniq.awk':63
1      -| #1  in main() at `awklib/eg/prog/uniq.awk':88
1 
1    This tells us that 'are_equal()' was called by the main program at
1 line 88 of 'uniq.awk'.  (This is not a big surprise, because this is the
1 only call to 'are_equal()' in the program, but in more complex programs,
1 knowing who called a function and with what parameters can be the key to
1 finding the source of the problem.)
1 
1    Now that we're in 'are_equal()', we can start looking at the values
1 of some variables.  Let's say we type 'p n' ('p' is short for "print").
1 We would expect to see the value of 'n', a parameter to 'are_equal()'.
1 Actually, the debugger gives us:
1 
1      gawk> p n
1      -| n = untyped variable
1 
1 In this case, 'n' is an uninitialized local variable, because the
1 function was called without arguments (⇒Function Calls).
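The extra names in the parameter list are how 'awk' declares local variables, and a local that has not been assigned yet acts as both the empty string and zero.  Here is a small sketch of that behavior; the function name 'probe' is hypothetical and is not part of 'uniq.awk':

```shell
# "n" is declared as an extra parameter, so it is local to probe().
# Calling probe() with no arguments leaves it uninitialized: it compares
# equal to "" as a string and to 0 as a number.
untyped=$(awk '
function probe(    n) {
    return (n == "" && n == 0)
}
BEGIN { print probe() }')
printf '%s\n' "$untyped"
```

The output is '1', meaning both comparisons were true.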
1 
1    A more useful variable to display might be the current record:
1 
1      gawk> p $0
1      -| $0 = "gawk is a wonderful program!"
1 
1 This might be a bit puzzling at first, as this is the second line of our
1 test input.  Let's look at 'NR':
1 
1      gawk> p NR
1      -| NR = 2
1 
1 So we can see that 'are_equal()' was not called until the second record
1 of the file.  Of course, this is because our program contains a rule for
1 'NR == 1':
1 
1      NR == 1 {
1          last = $0
1          next
1      }
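Because of 'next', no later rule ever sees the first record, so anything that compares lines runs for the first time when 'NR' is 2.  The sketch below shows the same pattern, with a plain 'print' standing in for the call to 'are_equal()'; the second rule is an illustration and is not the rule from 'uniq.awk':

```shell
# The NR == 1 rule saves the first line and skips the remaining rules.
# The second rule therefore runs first on record 2, with "last" already set.
trace=$(printf '%s\n' 'awk is a wonderful program!' \
                      'gawk is a wonderful program!' |
    awk '
NR == 1 {
    last = $0
    next
}
{
    print NR ": last=" last
    last = $0
}')
printf '%s\n' "$trace"
```

Only one line is printed, for record 2, and it shows that 'last' already holds the first input line.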
1 
1    OK, let's just check that that rule worked correctly:
1 
1      gawk> p last
1      -| last = "awk is a wonderful program!"
1 
1    Everything we have done so far has verified that the program has
1 worked as planned, up to and including the call to 'are_equal()', so the
1 problem must be inside this function.  To investigate further, we must
1 begin "stepping through" the lines of 'are_equal()'.  We start by typing
1 'n' (for "next"):
1 
1      gawk> n
1      -| 66          if (fcount > 0) {
1 
1    This tells us that 'gawk' is now ready to execute line 66, which
1 decides whether to give the lines the special "field-skipping" treatment
1 requested by the '-1' command-line option, which skips one field.
1 (Notice that we skipped from where we were before, at line 63, to here,
1 because the condition in line 63, 'if (fcount == 0 && charcount == 0)',
1 was false.)
1 
1    Continuing to step, we now get to the splitting of the current and
1 last records:
1 
1      gawk> n
1      -| 67              n = split(last, alast)
1      gawk> n
1      -| 68              m = split($0, aline)
1 
1    At this point, we should be curious to see what our records were
1 split into, so we try to look:
1 
1      gawk> p n m alast aline
1      -| n = 5
1      -| m = untyped variable
1      -| alast = array, 5 elements
1      -| aline = untyped variable
1 
1 (The 'p' command can take more than one argument, similar to 'awk''s
1 'print' statement.)
1 
1    This is kind of disappointing, though.  All we found out is that
1 there are five elements in 'alast'; 'm' and 'aline' don't have values
1 because we are at line 68 but haven't executed it yet.  This information
1 is useful enough (we now know that none of the words were accidentally
1 left out), but what if we want to see inside the array?
1 
1    The first choice would be to use subscripts:
1 
1      gawk> p alast[0]
1      -| "0" not in array `alast'
1 
1 Oops!
1 
1      gawk> p alast[1]
1      -| alast["1"] = "awk"
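The same check can be made in 'awk' itself: 'split()' returns the number of pieces and stores them under indices 1 through that count, so index 0 is never created.  A minimal sketch, using the first sample line as a literal string:

```shell
# split() fills the array starting at index 1 and returns the element count.
# The "in" operator tests for an index without creating it.
split_check=$(awk 'BEGIN {
    n = split("awk is a wonderful program!", alast)
    print n
    print ((0 in alast) ? "0 present" : "0 missing")
    print alast[1]
}')
printf '%s\n' "$split_check"
```

This prints '5', '0 missing', and 'awk', matching what the debugger reported.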
1 
1    This would be kind of slow for a 100-member array, though, so 'gawk'
1 provides a shortcut (reminiscent of another language not to be
1 mentioned):
1 
1      gawk> p @alast
1      -| alast["1"] = "awk"
1      -| alast["2"] = "is"
1      -| alast["3"] = "a"
1      -| alast["4"] = "wonderful"
1      -| alast["5"] = "program!"
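The '@' shortcut belongs to the debugger's 'p' command; inside an 'awk' program, a counting loop gives a similar listing.  (A 'for (i in alast)' loop would also visit every element, but in an unspecified order.)  This is a sketch that assumes the array came from 'split()', so that its indices run from 1 to 'n':

```shell
dump=$(awk 'BEGIN {
    n = split("awk is a wonderful program!", alast)
    # Walk the indices in numeric order, mimicking the debugger output.
    for (i = 1; i <= n; i++)
        printf "alast[\"%d\"] = \"%s\"\n", i, alast[i]
}')
printf '%s\n' "$dump"
```

Adding a loop like this temporarily to a program is a common fallback when a debugger is not available.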
1 
1    It looks like we got this far OK.  Let's take another step or two:
1 
1      gawk> n
1      -| 69              clast = join(alast, fcount, n)
1      gawk> n
1      -| 70              cline = join(aline, fcount, m)
1 
1    Well, here we are at our error (sorry to spoil the suspense).  What
1 we had in mind was to join the fields starting from the second one to
1 make the virtual record to compare, and if the first field were numbered
1 zero, this would work.  Once both assignments have run, let's look at
1 what we've got:
1 
1      gawk> p cline clast
1      -| cline = "gawk is a wonderful program!"
1      -| clast = "awk is a wonderful program!"
1 
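1    Hey, those look pretty familiar!  They're just our original,
1 unaltered input records.  A little thinking (the human brain is still
1 the best debugging tool), and we realize that we were off by one: with
1 'fcount' equal to 1, 'join()' started at field 1, the very first field,
1 so nothing was skipped.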
1 
1    We get out of the debugger:
1 
1      gawk> q
1      -| The program is running. Exit anyway (y/n)? y
1 
1 Then we get into an editor and change the two lines back to:
1 
1      clast = join(alast, fcount+1, n)
1      cline = join(aline, fcount+1, m)
1 
1 and the problem is solved!
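To see the off-by-one in isolation, the comparison can be reproduced in a few lines.  This is only a sketch under stated assumptions: 'join()' here is a simplified stand-in for the library function that 'uniq.awk' uses, 'compare()' is a hypothetical helper, and 'fcount' is set by hand instead of being parsed from the command line.

```shell
result=$(awk '
# Simplified stand-in: concatenate array[start..end] separated by spaces.
function join(array, start, end,    result, i) {
    result = array[start]
    for (i = start + 1; i <= end; i++)
        result = result " " array[i]
    return result
}
# Hypothetical helper: compare two lines from field number "first" onward.
function compare(a, b, first,    n, m, xa, xb) {
    n = split(a, xa)
    m = split(b, xb)
    return (join(xa, first, n) == join(xb, first, m))
}
BEGIN {
    fcount = 1    # skip one field, as with the -1 option
    line1 = "awk is a wonderful program!"
    line2 = "gawk is a wonderful program!"
    print "buggy (start at fcount):   " compare(line1, line2, fcount)
    print "fixed (start at fcount+1): " compare(line1, line2, fcount + 1)
}')
printf '%s\n' "$result"
```

The buggy form reports 0 (the lines differ, because the first field is still included), while the fixed form reports 1 (the lines match once the first field is skipped).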
1