gawk: Programs Exercises

11.5 Exercises
==============

  1. Rewrite 'cut.awk' (⇒Cut Program) using 'split()' with '""'
     as the separator.

  2. In ⇒Egrep Program, we mentioned that 'egrep -i' could be
     simulated in versions of 'awk' without 'IGNORECASE' by using
     'tolower()' on the line and the pattern. In a footnote there, we
     also mentioned that this solution has a bug: the translated line is
     output, and not the original one. Fix this problem.

  3. The POSIX version of 'id' takes options that control which
     information is printed. Modify the 'awk' version (⇒Id
     Program) to accept the same arguments and perform in the same
     way.

  4. The 'split.awk' program (⇒Split Program) assumes that
     letters are contiguous in the character set, which isn't true for
     EBCDIC systems. Fix this problem. (Hint: Consider a different way
     to work through the alphabet, without relying on 'ord()' and
     'chr()'.)

  5. In 'uniq.awk' (⇒Uniq Program), the logic for choosing which
     lines to print represents a "state machine", which is "a device
     that can be in one of a set number of stable conditions depending
     on its previous condition and on the present values of its
     inputs."(1) Brian Kernighan suggests that "an alternative approach
     to state machines is to just read the input into an array, then use
     indexing. It's almost always easier code, and for most inputs
     where you would use this, just as fast." Rewrite the logic to
     follow this suggestion.

  6. Why can't the 'wc.awk' program (⇒Wc Program) just use the
     value of 'FNR' in 'endfile()'? Hint: Examine the code in
     ⇒Filetrans Function.

  7. Manipulation of individual characters in the 'translate' program
     (⇒Translate Program) is painful using standard 'awk'
     functions. Given that 'gawk' can split strings into individual
     characters using '""' as the separator, how might you use this
     feature to simplify the program?

  8. The 'extract.awk' program (⇒Extract Program) was written
     before 'gawk' had the 'gensub()' function. Use it to simplify the
     code.

  9. Compare the performance of the 'awksed.awk' program (⇒Simple
     Sed) with the more straightforward:

          BEGIN {
              pat = ARGV[1]
              repl = ARGV[2]
              ARGV[1] = ARGV[2] = ""
          }

          { gsub(pat, repl); print }

 10. What are the advantages and disadvantages of 'awksed.awk' versus
     the real 'sed' utility?

 11. In ⇒Igawk Program, we mentioned that not trying to save the
     line read with 'getline' in the 'pathto()' function when testing
     for the file's accessibility for use with the main program
     simplifies things considerably. What problem does this engender
     though?

 12. As an additional example of the idea that it is not always
     necessary to add new features to a program, consider the idea of
     having two files in a directory in the search path:

     'default.awk'
          This file contains a set of default library functions, such as
          'getopt()' and 'assert()'.

     'site.awk'
          This file contains library functions that are specific to a
          site or installation; i.e., locally developed functions.
          Having a separate file allows 'default.awk' to change with new
          'gawk' releases, without requiring the system administrator to
          update it each time by adding the local functions.

     One user suggested that 'gawk' be modified to automatically read
     these files upon startup. Instead, it would be very simple to
     modify 'igawk' to do this. Since 'igawk' can process nested
     '@include' directives, 'default.awk' could simply contain
     '@include' statements for the desired library functions. Make this
     change.

 13. Modify 'anagram.awk' (⇒Anagram Program) to avoid the use
     of the external 'sort' utility.

   ---------- Footnotes ----------

   (1) This is the definition returned from entering 'define: state
machine' into Google.