gawk: Programs Exercises

11.5 Exercises
==============

  1. Rewrite 'cut.awk' (⇒Cut Program) using 'split()' with '""'
     as the separator.

  2. In ⇒Egrep Program, we mentioned that 'egrep -i' could be
     simulated in versions of 'awk' without 'IGNORECASE' by using
     'tolower()' on the line and the pattern.  In a footnote there, we
     also mentioned that this solution has a bug: the translated line is
     output, and not the original one.  Fix this problem.

  3. The POSIX version of 'id' takes options that control which
     information is printed.  Modify the 'awk' version (⇒Id Program)
     to accept the same arguments and perform in the same way.

  4. The 'split.awk' program (⇒Split Program) assumes that
     letters are contiguous in the character set, which isn't true for
     EBCDIC systems.  Fix this problem.  (Hint: Consider a different way
     to work through the alphabet, without relying on 'ord()' and
     'chr()'.)

  5. In 'uniq.awk' (⇒Uniq Program), the logic for choosing which
     lines to print represents a "state machine", which is "a device
     that can be in one of a set number of stable conditions depending
     on its previous condition and on the present values of its
     inputs."(1)  Brian Kernighan suggests that "an alternative approach
     to state machines is to just read the input into an array, then use
     indexing.  It's almost always easier code, and for most inputs
     where you would use this, just as fast."  Rewrite the logic to
     follow this suggestion.

  6. Why can't the 'wc.awk' program (⇒Wc Program) just use the
     value of 'FNR' in 'endfile()'?  Hint: Examine the code in
     ⇒Filetrans Function.

  7. Manipulation of individual characters in the 'translate' program
     (⇒Translate Program) is painful using standard 'awk'
     functions.  Given that 'gawk' can split strings into individual
     characters using '""' as the separator, how might you use this
     feature to simplify the program?

  8. The 'extract.awk' program (⇒Extract Program) was written
     before 'gawk' had the 'gensub()' function.  Use it to simplify the
     code.

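  9. Compare the performance of the 'awksed.awk' program (⇒Simple Sed)
     with the more straightforward:

          BEGIN {
              pat = ARGV[1]             # regexp to search for
              repl = ARGV[2]            # replacement text
              ARGV[1] = ARGV[2] = ""    # keep awk from treating them as files
          }

          { gsub(pat, repl); print }
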
 10. What are the advantages and disadvantages of 'awksed.awk' versus
     the real 'sed' utility?

 11. In ⇒Igawk Program, we mentioned that not trying to save the
     line read with 'getline' in the 'pathto()' function when testing
     for the file's accessibility for use with the main program
     simplifies things considerably.  What problem does this engender,
     though?

 12. As an additional example of the idea that it is not always
     necessary to add new features to a program, consider the idea of
     having two files in a directory in the search path:

     'default.awk'
          This file contains a set of default library functions, such as
          'getopt()' and 'assert()'.

     'site.awk'
          This file contains library functions that are specific to a
          site or installation; i.e., locally developed functions.
          Having a separate file allows 'default.awk' to change with new
          'gawk' releases, without requiring the system administrator to
          update it each time by adding the local functions.

     One user suggested that 'gawk' be modified to automatically read
     these files upon startup.  Instead, it would be very simple to
     modify 'igawk' to do this.  Since 'igawk' can process nested
     '@include' directives, 'default.awk' could simply contain
     '@include' statements for the desired library functions.  Make this
     change.

 13. Modify 'anagram.awk' (⇒Anagram Program) to avoid the use
     of the external 'sort' utility.

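As a possible starting point for exercise 2, the idea is to test a
case-folded copy of each record while printing the record exactly as it
was read.  The sketch below wraps a small 'awk' program in a shell
function.  The function name 'igrep' and the variables 'igrep_pat' and
'pat' are illustrative only and do not come from 'egrep.awk'; the sketch
assumes a POSIX 'awk' that provides 'tolower()'.  Folding the pattern
also folds regexp escapes such as '\W', so a complete solution needs
more care than this.

```shell
# igrep PATTERN [FILE...]
# Hypothetical helper: print lines that match PATTERN, ignoring case.
igrep() {
    igrep_pat=$1
    shift
    awk -v pat="$igrep_pat" '
        BEGIN { pat = tolower(pat) }     # fold the pattern once
        tolower($0) ~ pat { print }      # match a folded copy; print $0 as read
    ' "$@"
}
```

For example, piping the two lines 'Hello World' and 'bye' into
'igrep WORLD' prints 'Hello World' with its original capitalization.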
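The hint in exercise 4 can be read as: keep the alphabet in a string and
move through it by position, so that nothing depends on the numeric
values of the letters.  The following is a minimal sketch of that idea,
assuming two-letter lowercase suffixes.  The function name 'next_suffix'
and the variable names are hypothetical and are not taken from
'split.awk'.

```shell
# next_suffix XY
# Hypothetical helper: print the two-letter suffix that follows XY
# (aa -> ab, az -> ba); exit status 1 after zz, 2 on malformed input.
next_suffix() {
    awk -v s="$1" '
        BEGIN {
            letters = "abcdefghijklmnopqrstuvwxyz"
            n = length(letters)
            i = index(letters, substr(s, 1, 1))   # 1-based position, 0 if absent
            j = index(letters, substr(s, 2, 1))
            if (length(s) != 2 || i == 0 || j == 0)
                exit 2
            if (++j > n) {                        # second letter wraps around
                j = 1
                if (++i > n)
                    exit 1                        # zz has no successor
            }
            print substr(letters, i, 1) substr(letters, j, 1)
        }'
}
```

Because 'index()' and 'substr()' work on positions within the string
'letters', the same code behaves identically whether or not the
underlying character set stores the letters contiguously.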
   ---------- Footnotes ----------

   (1) This is the definition returned from entering 'define: state
machine' into Google.