gawk: Definition Syntax

9.2.1 Function Definition Syntax
--------------------------------

     It's entirely fair to say that the awk syntax for local variable
     definitions is appallingly awful.
                              -- _Brian Kernighan_

Definitions of functions can appear anywhere between the rules of an
'awk' program. Thus, the general form of an 'awk' program is extended
to include sequences of rules _and_ user-defined function definitions.
There is no need to put the definition of a function before all uses of
the function. This is because 'awk' reads the entire program before
starting to execute any of it.
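As a minimal sketch of this point, the following shell snippet runs a
small program in which the call comes before the definition. It assumes
a POSIX-compatible 'awk' on the search path; the function name 'double'
and the shell variable 'prog' are hypothetical, chosen only for
illustration.

```shell
# Hypothetical example: the call to double() precedes its definition.
prog='
BEGIN { print double(21) }

# Defined after its first use; awk has read the whole program by then.
function double(x) {
    return 2 * x
}
'
awk "$prog"    # prints 42
```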

The definition of a function named NAME looks like this:

     'function' NAME'('[PARAMETER-LIST]')'
     '{'
          BODY-OF-FUNCTION
     '}'

Here, NAME is the name of the function to define. A valid function name
is like a valid variable name: a sequence of letters, digits, and
underscores that doesn't start with a digit. Here too, only the 52
upper- and lowercase English letters may be used in a function name.
Within a single 'awk' program, any particular name can be used as only
one of these: a variable, an array, or a function.

PARAMETER-LIST is an optional list of the function's arguments and
local variable names, separated by commas. When the function is called,
the argument names are used to hold the argument values given in the
call.

A function cannot have two parameters with the same name, nor may it
have a parameter with the same name as the function itself.

     CAUTION: According to the POSIX standard, function parameters
     cannot have the same name as one of the special predefined
     variables (⇒Built-in Variables), nor may a function
     parameter have the same name as another function.

     Not all versions of 'awk' enforce these restrictions. 'gawk'
     always enforces the first restriction. With '--posix'
     (⇒Options), it also enforces the second restriction.

Local variables act like the empty string if referenced where a
string value is required, and like zero if referenced where a numeric
value is required. This is the same as the behavior of regular
variables that have never been assigned a value. (There is more to
understand about local variables; ⇒Dynamic Typing.)
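The behavior of an unassigned local variable can be sketched as follows.
This is an illustration only, assuming a POSIX-compatible 'awk'; the
function name 'probe', its local 'tmp', and the shell variable 'prog'
are hypothetical.

```shell
# Hypothetical example: tmp is never assigned, and no caller supplies it.
prog='
function probe(    tmp) {
    # In string context tmp acts like "", in numeric context like 0.
    return "[" tmp "] " (tmp + 0)
}
BEGIN { print probe() }
'
awk "$prog"    # prints [] 0
```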

The BODY-OF-FUNCTION consists of 'awk' statements. It is the most
important part of the definition, because it says what the function
should actually _do_. The argument names exist to give the body a way
to talk about the arguments; local variables exist to give the body
places to keep temporary values.

Argument names are not distinguished syntactically from local
variable names. Instead, the number of arguments supplied when the
function is called determines how many argument variables there are.
Thus, if three argument values are given, the first three names in
PARAMETER-LIST are arguments and the rest are local variables.

It follows that if the number of arguments is not the same in all
calls to the function, some of the names in PARAMETER-LIST may be
arguments on some occasions and local variables on others. Another way
to think of this is that omitted arguments default to the null string.
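The following sketch shows one name serving as an argument in one call
and as an unset local variable in another. It assumes a POSIX-compatible
'awk'; the function name 'join2', its parameters, and the shell variable
'prog' are hypothetical.

```shell
# Hypothetical example: the same function called with two and one arguments.
prog='
function join2(word, sep) {
    return word sep word
}
BEGIN {
    print join2("ab", "-")   # sep is an argument holding "-"
    print join2("ab")        # sep is omitted, so it acts like ""
}
'
awk "$prog"    # prints ab-ab, then abab
```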

Usually when you write a function, you know how many names you intend
to use for arguments and how many you intend to use as local variables.
It is conventional to place some extra space between the arguments and
the local variables, in order to document how your function is supposed
to be used.
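A short sketch of this spacing convention, assuming a POSIX-compatible
'awk': in the hypothetical function 'sum_to', 'n' is the intended
argument, while 'i' and 'total' are meant to be local variables and are
set apart by extra spaces. All names here are illustrative.

```shell
# Hypothetical example: extra spaces separate the argument from the locals.
prog='
function sum_to(n,    i, total) {
    for (i = 1; i <= n; i++)
        total += i
    return total
}
BEGIN { print sum_to(4) }
'
awk "$prog"    # prints 10
```

The extra spaces have no meaning to 'awk' itself; they only tell a human
reader that callers are expected to pass a single value.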

During execution of the function body, the arguments and local
variable values hide, or "shadow", any variables of the same names used
in the rest of the program. The shadowed variables are not accessible
in the function definition, because there is no way to name them while
their names are in use for the arguments and local variables.
All other variables used in the 'awk' program can be referenced or set
normally in the function's body.

The arguments and local variables last only as long as the function
body is executing. Once the body finishes, you can once again access
the variables that were shadowed while the function was running.
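Shadowing can be sketched like this, assuming a POSIX-compatible 'awk'.
The function name 'demo' and the variables 'n', 'r', and 'count' are
hypothetical: 'n' exists both as a global variable and as a local one,
while 'count' exists only as a global.

```shell
# Hypothetical example: a local n shadows the global n inside demo().
prog='
function demo(x,    n) {
    n = x * 2        # assigns the local n; the global n is untouched
    count++          # count is not in the parameter list, so it is global
    return n
}
BEGIN {
    n = 5
    r = demo(10)
    print r, n, count
}
'
awk "$prog"    # prints 20 5 1
```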

The function body can contain expressions that call functions. They
can even call this function, either directly or by way of another
function. When this happens, we say the function is "recursive". The
act of a function calling itself is called "recursion".
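As an illustration of direct recursion, here is a minimal sketch of a
factorial function, assuming a POSIX-compatible 'awk'. The name 'fact'
and the shell variable 'prog' are hypothetical.

```shell
# Hypothetical example: fact() calls itself until n drops to 1.
prog='
function fact(n) {
    if (n <= 1)
        return 1
    return n * fact(n - 1)
}
BEGIN { print fact(5) }
'
awk "$prog"    # prints 120
```

Each call gets its own copy of 'n', which is why the outer calls still
see their own values after the inner calls return.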

All the built-in functions return a value to their caller.
User-defined functions can do so also, using the 'return' statement,
which is described in detail in ⇒Return Statement. Many of the
subsequent examples in this minor node use the 'return' statement.

In many 'awk' implementations, including 'gawk', the keyword
'function' may be abbreviated 'func' (a common extension). However,
POSIX only specifies the use of the keyword 'function'. This actually
has some practical implications. If 'gawk' is in POSIX-compatibility
mode (⇒Options), then the following statement does _not_ define a
function:

     func foo() { a = sqrt($1) ; print a }

Instead, it defines a rule that, for each record, concatenates the value
of the variable 'func' with the return value of the function 'foo'. If
the resulting string is non-null, the action is executed. This is
probably not what is desired. ('awk' accepts this input as
syntactically valid, because functions may be used before they are
defined in 'awk' programs.(1))

To ensure that your 'awk' programs are portable, always use the
keyword 'function' when defining a function.

   ---------- Footnotes ----------

   (1) This program won't actually run, because 'foo()' is undefined.