gawk: Basic High Level

1 
1 D.1 What a Program Does
1 =======================
1 
1 At the most basic level, the job of a program is to process some input
1 data and produce results.  See ⇒Figure D.1.
1 
1 [image src="general-program.png" alt="General program flow" text="                  _______
1 +------+         /       \\         +---------+
1 | Data | -----> < Program > -----> | Results |
1 +------+         \\_______/         +---------+"]
1 
1 Figure D.1: General Program Flow
1 
1    The "program" in the figure can be either a compiled program(1) (such
1 as 'ls') or an "interpreted" one.  In the latter case, a
1 machine-executable program such as 'awk' reads your program, and then
1 uses the instructions in your program to process the data.
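
   As a small illustration of the interpreted case, the sketch below
hands 'awk' a one-line program as ordinary text on the command line.
The function name 'interp_demo' and the sample input are made up for
this example.

```sh
# Sketch: awk (the machine-executable program) reads the program text
# given in quotes, then applies it to the data arriving on standard input.
interp_demo() {
    echo 'some input' | awk '{ print "read:", $0 }'
}

interp_demo
```

   Run as written, this prints 'read: some input'.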
1 
1    When you write a program, it usually consists of the following, very
1 basic set of steps, as shown in ⇒Figure D.2:
1 
1 [image src="process-flow.png" alt="Basic Program Stages" text="                              ______
1 +----------------+           / More \\  No       +----------+
1 | Initialization | -------> <  Data  > -------> | Clean Up |
1 +----------------+    ^      \\   ?  /           +----------+
1                       |       +--+-+
1                       |          | Yes
1                       |          |
1                       |          V
1                       |     +---------+
1                       +-----+ Process |
1                             +---------+"]
1 
1 Figure D.2: Basic Program Steps
1 
1 Initialization
1      These are the things you do before actually starting to process
1      data, such as checking arguments, initializing any data you need to
1      work with, and so on.  This step corresponds to 'awk''s 'BEGIN'
1      rule (⇒BEGIN/END).
1 
1      If you were baking a cake, this might consist of laying out all the
1      mixing bowls and the baking pan, and making sure you have all the
1      ingredients that you need.
1 
1 Processing
1      This is where the actual work is done.  Your program reads data,
1      one logical chunk at a time, and processes it as appropriate.
1 
1      In most programming languages, you have to manually manage the
1      reading of data, checking to see if there is more each time you
1      read a chunk.  'awk''s pattern-action paradigm (⇒Getting Started)
1      handles the mechanics of this for you.
1 
1      In baking a cake, the processing corresponds to the actual labor:
1      breaking eggs, mixing the flour, water, and other ingredients, and
1      then putting the cake into the oven.
1 
1 Clean Up
1      Once you've processed all the data, you may have things you need to
1      do before exiting.  This step corresponds to 'awk''s 'END' rule
1      (⇒BEGIN/END).
1 
1      After the cake comes out of the oven, you still have to wrap it in
1      plastic wrap to keep anyone from tasting it, as well as wash the
1      mixing bowls and utensils.
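
   The three stages above might be sketched in 'awk' roughly as
follows.  This is only a minimal illustration: the function name
'sum_demo', the variable 'total', and the sample numbers are assumptions
made up for the example.

```sh
# Sketch: initialization, processing, and clean up as awk rules.
sum_demo() {
    printf '10\n20\n30\n' | awk '
        BEGIN { total = 0 }              # initialization: runs before any input is read
              { total += $1 }            # processing: runs once for each input record
        END   { print "total:", total }  # clean up: runs after the last record
    '
}

sum_demo
```

   With the sample input shown, this prints 'total: 60'.  Note that the
program never asks whether more data remains; 'awk' makes that check
itself.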
1 
1    An "algorithm" is a detailed set of instructions necessary to
1 accomplish a task, or process data.  It is much the same as a recipe for
1 baking a cake.  Programs implement algorithms.  Often, it is up to you
1 to design the algorithm and implement it, simultaneously.
1 
1    The "logical chunks" we talked about previously are called "records",
1 similar to the records a company keeps on employees, a school keeps for
1 students, or a doctor keeps for patients.  Each record has many
1 component parts, such as first and last names, date of birth, address,
1 and so on.  The component parts are referred to as the "fields" of the
1 record.
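
   To make records and fields concrete, here is a small sketch.  It
relies on 'awk''s default behavior, in which each input line is one
record and fields are separated by whitespace.  The function name
'fields_demo' and the sample data are hypothetical.

```sh
# Sketch: looking at records and fields with the default separators.
fields_demo() {
    printf 'Ada Lovelace 1815\nAlan Turing 1912\n' | awk '
        # NR is the number of the current record; NF is how many fields it has.
        # $1, $2, ... are the individual fields; $0 is the whole record.
        { print "record " NR " has " NF " fields; last name: " $2 }
    '
}

fields_demo
```

   For the two sample lines, this reports three fields in each record
and prints the second field ('Lovelace', then 'Turing') as the last
name.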
1 
1    The act of reading data is termed "input", and that of generating
1 results, not too surprisingly, is termed "output".  They are often
1 referred to together as "input/output," and even more often, as "I/O"
1 for short.  (You will also see "input" and "output" used as verbs.)
1 
1    'awk' manages the reading of data for you, as well as breaking it
1 up into records and fields.  Your program's job is to tell 'awk' what to
1 do with the data.  You do this by describing "patterns" in the data to
1 look for, and "actions" to execute when those patterns are seen.  This
1 "data-driven" nature of 'awk' programs usually makes them both easier to
1 write and easier to read.
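
   A pattern-action program might look something like the sketch below.
Each rule pairs a pattern with an action in braces, and 'awk' tests
every rule against every record.  The function name 'pattern_demo' and
the inventory data are invented for illustration.

```sh
# Sketch: two pattern-action rules applied to each input record.
pattern_demo() {
    printf 'apple 3\nbanana 12\ncherry 7\n' | awk '
        $2 > 5  { print $1, "is well stocked" }  # pattern: a comparison on field 2
        /^a/    { print $1, "starts with a" }    # pattern: a regular expression
    '
}

pattern_demo
```

   Here 'apple 3' matches only the second rule, while 'banana 12' and
'cherry 7' match only the first, so the output is one line per record.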
1 
1    ---------- Footnotes ----------
1 
1    (1) Compiled programs are typically written in lower-level languages
1 such as C, C++, or Ada, and then translated, or "compiled", into a form
1 that the computer can execute directly.
1