gawk: Ranges

7.1.3 Specifying Record Ranges with Patterns
--------------------------------------------

A "range pattern" is made of two patterns separated by a comma, in the
form 'BEGPAT, ENDPAT'.  It is used to match ranges of consecutive input
records.  The first pattern, BEGPAT, controls where the range begins,
while ENDPAT controls where the range ends.  For example, the
following:

     awk '$1 == "on", $1 == "off"' myfile

prints every record in 'myfile' between 'on'/'off' pairs, inclusive.

   A range pattern starts out by matching BEGPAT against every input
record.  When a record matches BEGPAT, the range pattern is "turned on",
and the range pattern matches this record as well.  As long as the range
pattern stays turned on, it automatically matches every input record
read.  While it is turned on, the range pattern also matches ENDPAT
against each input record, including the one that turned it on; when
this succeeds, the range pattern is "turned off" again for the
following record.  Then the range pattern goes back to checking BEGPAT
against each record.
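
   The on/off behavior described above can be tried on a small sample.
The following is a minimal sketch; the input records are invented for
illustration and the variable names 'input' and 'out' are arbitrary:

```shell
# Sample input invented for illustration: two on/off pairs with
# unrelated records around them.
input='skip me
on first
a 1
off first
ignored
on second
off second
tail'

# The range is turned on by a record whose first field is "on" and
# turned off after a record whose first field is "off".
out=$(printf '%s\n' "$input" | awk '$1 == "on", $1 == "off"')
printf '%s\n' "$out"
```

Only the records from each 'on' record through the matching 'off'
record are printed; 'skip me', 'ignored', and 'tail' fall outside both
ranges.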

   The record that turns on the range pattern and the one that turns it
off both match the range pattern.  If you don't want to operate on these
records, you can write 'if' statements in the rule's action to
distinguish them from the records you are interested in.
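
   One way to write such an 'if' statement is sketched below.  It
assumes that the delimiter records can be recognized by the same tests
used in the range pattern; the sample input is invented for
illustration:

```shell
# Sample input invented for illustration.
input='before
on
x
y
off
after'

# Records inside the range are printed unless they are the "on" or
# "off" delimiter records themselves.
out=$(printf '%s\n' "$input" |
  awk '$1 == "on", $1 == "off" { if ($1 != "on" && $1 != "off") print }')
printf '%s\n' "$out"
```

Here only 'x' and 'y' are printed.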

   It is possible for a range pattern to be turned on and off by the
same record.  If the record satisfies both conditions, then the action
is executed for just that record.  For example, suppose there is text
between two identical markers (e.g., the '%' symbol), each on its own
line, that should be ignored.  A first attempt would be to combine a
range pattern that describes the delimited text with the 'next'
statement (not discussed yet, ⇒Next Statement).  The 'next' statement
causes 'awk' to skip any further processing of the current record and
start over again with the next input record.  Such a program looks like
this:

     /^%$/,/^%$/    { next }
                    { print }

This program fails because the range pattern is both turned on and
turned off by each line that has just a '%' on it, so only the marker
lines themselves are skipped.  To accomplish this task, write the
program in the following manner, using a flag:

     /^%$/     { skip = ! skip; next }
     skip == 1 { next }   # skip lines while 'skip' is set
               { print }  # print everything else
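
   The difference between the two programs can be seen by running both
on the same input.  This is a minimal sketch: the sample text is
invented for illustration, and the flag version is shown with a final
'{ print }' rule so that its output can be compared:

```shell
# Sample input invented for illustration: one block delimited by "%".
input='keep 1
%
hidden
%
keep 2'

# First attempt: each "%" line both starts and ends the range, so only
# the "%" lines themselves are skipped.
range_out=$(printf '%s\n' "$input" | awk '
/^%$/,/^%$/    { next }
               { print }')

# Flag version: every "%" line toggles the flag, and records are
# skipped while the flag is set.
flag_out=$(printf '%s\n' "$input" | awk '
/^%$/     { skip = ! skip; next }
skip == 1 { next }
          { print }')

printf 'range version:\n%s\n' "$range_out"
printf 'flag version:\n%s\n' "$flag_out"
```

The range version still prints 'hidden', whereas the flag version
prints only 'keep 1' and 'keep 2'.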

   In a range pattern, the comma (',') has the lowest precedence of all
the operators (i.e., it is evaluated last).  For example, the following
program attempts to combine a range pattern with another, simpler test:

     echo Yes | awk '/1/,/2/ || /Yes/'

   The intent of this program is '(/1/,/2/) || /Yes/'.  However, 'awk'
interprets this as '/1/, (/2/ || /Yes/)'.  This cannot be changed or
worked around; range patterns do not combine with other patterns:

     $ echo Yes | gawk '(/1/,/2/) || /Yes/'
     error-> gawk: cmd. line:1: (/1/,/2/) || /Yes/
     error-> gawk: cmd. line:1:           ^ syntax error
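
   The effect of this grouping is easier to see with several input
records.  The following sketch uses a four-line sample invented for
illustration:

```shell
# Sample input invented for illustration.
input='1
Yes
3
2'

# Parsed as /1/, (/2/ || /Yes/): the range is turned on by "1" and is
# already turned off by "Yes", so "3" and "2" are not printed.
out=$(printf '%s\n' "$input" | awk '/1/,/2/ || /Yes/')
printf '%s\n' "$out"
```

Only '1' and 'Yes' are printed.  Had the pattern meant
'(/1/,/2/) || /Yes/', the range would have run from '1' through '2' and
all four records would have been printed.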

   As a minor point of interest, although it is poor style, POSIX allows
you to put a newline after the comma in a range pattern.  (d.c.)