gawk: Basic High Level

D.1 What a Program Does
=======================

At the most basic level, the job of a program is to process some input
data and produce results. See ⇒Figure D.1.

[image src="general-program.png" alt="General program flow" text="                  _______
+------+         /       \\         +---------+
| Data | -----> < Program > -----> | Results |
+------+         \\_______/         +---------+" ]

Figure D.1: General Program Flow

The "program" in the figure can be either a compiled program(1) (such
as 'ls') or an "interpreted" one. In the latter case, a
machine-executable program such as 'awk' reads your program, and then
uses the instructions in your program to process the data.

When you write a program, it usually consists of the following very
basic set of steps, as shown in ⇒Figure D.2:

[image src="process-flow.png" alt="Basic Program Stages" text="                              ______
+----------------+           / More \\  No       +----------+
| Initialization | -------> <  Data  > -------> | Clean Up |
+----------------+    ^      \\   ?  /           +----------+
                      |       +--+-+
                      |          | Yes
                      |          |
                      |          V
                      |     +---------+
                      +-----+ Process |
                            +---------+" ]

Figure D.2: Basic Program Steps

Initialization
     These are the things you do before actually starting to process
     data, such as checking arguments, initializing any data you need to
     work with, and so on. This step corresponds to 'awk''s 'BEGIN'
     rule (⇒BEGIN/END).

     If you were baking a cake, this might consist of laying out all the
     mixing bowls and the baking pan, and making sure you have all the
     ingredients that you need.

Processing
     This is where the actual work is done. Your program reads data,
     one logical chunk at a time, and processes it as appropriate.

     In most programming languages, you have to manually manage the
     reading of data, checking to see if there is more each time you
     read a chunk. 'awk''s pattern-action paradigm (⇒Getting Started)
     handles the mechanics of this for you.

     In baking a cake, the processing corresponds to the actual labor:
     breaking eggs, mixing the flour, water, and other ingredients, and
     then putting the cake into the oven.

Clean Up
     Once you've processed all the data, you may have things you need to
     do before exiting. This step corresponds to 'awk''s 'END' rule
     (⇒BEGIN/END).

     After the cake comes out of the oven, you still have to wrap it in
     plastic wrap to keep anyone from tasting it, as well as wash the
     mixing bowls and utensils.
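
As a minimal sketch of how the three steps can map onto an 'awk'
program, the shell function below (the name 'sum_numbers' and the sample
numbers are made up for illustration) totals one number per input line.
The 'BEGIN' rule does the initialization, the rule with no pattern does
the processing for each chunk of data, and the 'END' rule does the
clean up.

```shell
# Hypothetical helper: add up the numbers on standard input, one per line.
sum_numbers() {
    awk '
    BEGIN { total = 0 }               # initialization: runs before any input is read
          { total += $1 }             # processing: runs once for each input line
    END   { print "total:", total }   # clean up: runs after the last line
    '
}

# Made-up sample data, one number per line.
printf '10\n20\n30\n' | sum_numbers   # prints: total: 60
```

Note that the program never asks whether more data is available; 'awk'
runs the middle rule for each line and moves on to 'END' when the input
is exhausted.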

An "algorithm" is a detailed set of instructions necessary to
accomplish a task or process data. It is much the same as a recipe for
baking a cake. Programs implement algorithms. Often, it is up to you
to design the algorithm and implement it at the same time.

The "logical chunks" we talked about previously are called "records",
similar to the records a company keeps on employees, a school keeps for
students, or a doctor keeps for patients. Each record has many
component parts, such as first and last names, date of birth, address,
and so on. The component parts are referred to as the "fields" of the
record.
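
The following sketch illustrates records and fields with a made-up,
colon-separated list of people (the function name 'show_fields', the
names, and the dates are all invented for this example). By default
'awk' treats each input line as one record; the '-F:' option is used
here to make the colon the field separator. Within the program, 'NR' is
the number of the current record, 'NF' is the number of fields in it,
and '$2' is its second field.

```shell
# Hypothetical helper: describe each colon-separated record on standard input.
show_fields() {
    awk -F: '{ print "record " NR " has " NF " fields; last name is " $2 }'
}

# Made-up sample data: first name, last name, date of birth.
printf 'Pat:Smith:1990-01-15\nLee:Jones:1985-07-04\n' | show_fields
# prints:
#   record 1 has 3 fields; last name is Smith
#   record 2 has 3 fields; last name is Jones
```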

The act of reading data is termed "input", and that of generating
results, not too surprisingly, is termed "output". They are often
referred to together as "input/output," and even more often, as "I/O"
for short. (You will also see "input" and "output" used as verbs.)

'awk' manages the reading of data for you, as well as breaking it
up into records and fields. Your program's job is to tell 'awk' what to
do with the data. You do this by describing "patterns" in the data to
look for, and "actions" to execute when those patterns are seen. This
"data-driven" nature of 'awk' programs usually makes them both easier to
write and easier to read.
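
A small, hedged illustration of the pattern-action idea follows. The
function name 'scan_log' and the two-column sample data are invented for
this example. Each rule pairs a pattern with an action in braces: the
first rule prints every record that contains the text "error", the
second counts records whose second field is greater than 100, and the
'END' rule reports that count once all input has been read.

```shell
# Hypothetical helper: report "error" records and count large values.
scan_log() {
    awk '
    /error/  { print "line " NR ": " $0 }         # pattern: record contains "error"
    $2 > 100 { big++ }                            # pattern: second field exceeds 100
    END      { print big + 0, "large values" }    # big + 0 prints 0 if nothing matched
    '
}

# Made-up sample data: a status word and a number on each line.
printf 'ok 12\nerror 250\nok 300\n' | scan_log
# prints:
#   line 2: error 250
#   2 large values
```

The program contains no loop and no explicit test for end of input; it
only states what to look for and what to do when it is found.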

---------- Footnotes ----------

(1) Compiled programs are typically written in lower-level languages
such as C, C++, or Ada, and then translated, or "compiled", into a form
that the computer can execute directly.