gawk: Feature History
1
1 A.6 History of 'gawk' Features
1 ==============================
1
1 This minor node describes the features in 'gawk' over and above those in
1 POSIX 'awk', in the order they were added to 'gawk'.
1
1 Version 2.10 of 'gawk' introduced the following features:
1
1 * The 'AWKPATH' environment variable for specifying a path search for
1 the '-f' command-line option (⇒Options).
1
1 * The 'IGNORECASE' variable and its effects (⇒Case-sensitivity).
1
1 * The '/dev/stdin', '/dev/stdout', '/dev/stderr' and '/dev/fd/N'
1 special file names (⇒Special Files).
1
1 Version 2.13 of 'gawk' introduced the following features:
1
1 * The 'FIELDWIDTHS' variable and its effects (⇒Constant Size).
1
1 * The 'systime()' and 'strftime()' built-in functions for obtaining
1 and printing timestamps (⇒Time Functions).
1
1 * Additional command-line options (⇒Options):
1
1 - The '-W lint' option to provide error and portability checking,
1 both of the source code and at runtime.
1
1 - The '-W compat' option to turn off the GNU extensions.
1
1 - The '-W posix' option for full POSIX compliance.
1
1 Version 2.14 of 'gawk' introduced the following feature:
1
1 * The 'next file' statement for skipping to the next data file
1 (⇒Nextfile Statement).
1
1 Version 2.15 of 'gawk' introduced the following features:
1
1 * New variables (⇒Built-in Variables):
1
1 - 'ARGIND', which tracks the movement of 'FILENAME' through
1 'ARGV'.
1
1 - 'ERRNO', which contains the system error message when
1 'getline' returns -1 or 'close()' fails.
1
1 * The '/dev/pid', '/dev/ppid', '/dev/pgrpid', and '/dev/user' special
1 file names. These have since been removed.
1
1 * The ability to delete all of an array at once with 'delete ARRAY'
1 (⇒Delete).
1
1 * Command-line option changes (⇒Options):
1
1 - The ability to use GNU-style long-named options that start
1 with '--'.
1
1 - The '--source' option for mixing command-line and library-file
1 source code.
1
1 Version 3.0 of 'gawk' introduced the following features:
1
1 * New or changed variables:
1
1 - 'IGNORECASE' changed, now applying to string comparison as
1 well as regexp operations (⇒Case-sensitivity).
1
1 - 'RT', which contains the input text that matched 'RS'
1 (⇒Records).
1
1 * Full support for both POSIX and GNU regexps (⇒Regexp).
1
1 * The 'gensub()' function for more powerful text manipulation
1 (⇒String Functions).
1
1 * The 'strftime()' function acquired a default time format, allowing
1 it to be called with no arguments (⇒Time Functions).
1
1 * The ability for 'FS' and for the third argument to 'split()' to be
1 null strings (⇒Single Character Fields).
1
1 * The ability for 'RS' to be a regexp (⇒Records).
1
1 * The 'next file' statement became 'nextfile' (⇒Nextfile Statement).
1
1 * The 'fflush()' function from BWK 'awk' (then at Bell Laboratories;
1 ⇒I/O Functions).
1
1 * New command-line options:
1
1 - The '--lint-old' option to warn about constructs that are not
1 available in the original Version 7 Unix version of 'awk'
1 (⇒V7/SVR3.1).
1
1 - The '-m' option from BWK 'awk'. (Brian was still at Bell
1 Laboratories at the time.) This was later removed from both
1 his 'awk' and from 'gawk'.
1
1 - The '--re-interval' option to provide interval expressions in
1 regexps (⇒Regexp Operators).
1
1 - The '--traditional' option was added as a better name for
1 '--compat' (⇒Options).
1
1 * The use of GNU Autoconf to control the configuration process
1 (⇒Quick Installation).
1
1 * Amiga support. This has since been removed.
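
As a rough illustration of the 'gensub()' function listed above, the
following sketch swaps two parenthesized groups in every match.  The
wrapper name 'demo_gensub' is hypothetical, and the sketch assumes that a
'gawk' of version 3.0 or later is installed and on the search path.

```shell
# demo_gensub is a hypothetical wrapper name, not part of gawk.
# Assumes gawk 3.0 or later is available as "gawk".
demo_gensub() {
    gawk 'BEGIN {
        # (a+) and (b+) are groups 1 and 2.  In the replacement text,
        # "\\2" and "\\1" insert those groups in swapped order.
        # The "g" argument asks for every match to be replaced.
        print gensub(/(a+)(b+)/, "<\\2\\1>", "g", "aabb xab")
    }'
}
```

Calling 'demo_gensub' from a shell prints the rewritten string.  Unlike
'sub()' and 'gsub()', 'gensub()' returns its result and leaves the target
string unchanged.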
1
1 Version 3.1 of 'gawk' introduced the following features:
1
1 * New variables (⇒Built-in Variables):
1
1 - 'BINMODE', for non-POSIX systems, which allows binary I/O for
1 input and/or output files (⇒PC Using).
1
1 - 'LINT', which dynamically controls lint warnings.
1
1 - 'PROCINFO', an array for providing process-related
1 information.
1
1 - 'TEXTDOMAIN', for setting an application's
1 internationalization text domain (⇒Internationalization).
1
1 * The ability to use octal and hexadecimal constants in 'awk' program
1 source code (⇒Nondecimal-numbers).
1
1 * The '|&' operator for two-way I/O to a coprocess (⇒Two-way I/O).
1
1 * The '/inet' special files for TCP/IP networking using '|&'
1 (⇒TCP/IP Networking).
1
1 * The optional second argument to 'close()' that allows closing one
1 end of a two-way pipe to a coprocess (⇒Two-way I/O).
1
1 * The optional third argument to the 'match()' function for capturing
1 text-matching subexpressions within a regexp (⇒String Functions).
1
1 * Positional specifiers in 'printf' formats for making translations
1 easier (⇒Printf Ordering).
1
1 * A number of new built-in functions:
1
1 - The 'asort()' and 'asorti()' functions for sorting arrays
1 (⇒Array Sorting).
1
1 - The 'bindtextdomain()', 'dcgettext()' and 'dcngettext()'
1 functions for internationalization (⇒Programmer i18n).
1
1 - The 'extension()' function and the ability to add new built-in
1 functions dynamically (⇒Dynamic Extensions).
1
1 - The 'mktime()' function for creating timestamps (⇒Time Functions).
1
1 - The 'and()', 'or()', 'xor()', 'compl()', 'lshift()',
1 'rshift()', and 'strtonum()' functions (⇒Bitwise Functions).
1
1 * The support for 'next file' as two words was removed completely
1 (⇒Nextfile Statement).
1
1 * Additional command-line options (⇒Options):
1
1 - The '--dump-variables' option to print a list of all global
1 variables.
1
1 - The '--exec' option, for use in CGI scripts.
1
1 - The '--gen-po' command-line option and the use of a leading
1 underscore to mark strings that should be translated
1 (⇒String Extraction).
1
1 - The '--non-decimal-data' option to allow non-decimal input
1 data (⇒Nondecimal Data).
1
1 - The '--profile' option and 'pgawk', the profiling version of
1 'gawk', for producing execution profiles of 'awk' programs
1 (⇒Profiling).
1
1 - The '--use-lc-numeric' option to force 'gawk' to use the
1 locale's decimal point for parsing input data (⇒Conversion).
1
1 * The use of GNU Automake to help in standardizing the configuration
1 process (⇒Quick Installation).
1
1 * The use of GNU 'gettext' for 'gawk''s own message output
1 (⇒Gawk I18N).
1
1 * BeOS support. This was later removed.
1
1 * Tandem support. This was later removed.
1
1 * The Atari port became officially unsupported and was later removed
1 entirely.
1
1 * The source code changed to use ISO C standard-style function
1 definitions.
1
1 * POSIX compliance for 'sub()' and 'gsub()' (⇒Gory Details).
1
1 * The 'length()' function was extended to accept an array argument
1 and return the number of elements in the array (⇒String Functions).
1
1 * The 'strftime()' function acquired a third argument to enable
1 printing times as UTC (⇒Time Functions).
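
The third argument to 'match()' and the 'strtonum()' function from the
list above can be combined in a short sketch.  The wrapper name
'demo_match_strtonum' and the sample string are made up for illustration;
a 'gawk' of version 3.1 or later is assumed.

```shell
# demo_match_strtonum is a hypothetical wrapper name, not part of gawk.
# Assumes gawk 3.1 or later is available as "gawk".
demo_match_strtonum() {
    gawk 'BEGIN {
        # The third argument to match() is filled with the text matched
        # by each parenthesized subexpression.
        if (match("size=0x1F", /([a-z]+)=(0x[0-9A-Fa-f]+)/, parts)) {
            # strtonum() honors the leading 0x, so the hexadecimal
            # string is converted to its decimal value.
            print parts[1], parts[2], strtonum(parts[2])
        }
    }'
}
```

Calling 'demo_match_strtonum' prints the name, the original hexadecimal
text, and the converted number on one line.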
1
1 Version 4.0 of 'gawk' introduced the following features:
1
1 * Variable additions:
1
1 - 'FPAT', which allows you to specify a regexp that matches the
1 fields, instead of matching the field separator
1 (⇒Splitting By Content).
1
1 - If 'PROCINFO["sorted_in"]' exists, 'for(iggy in foo)' loops
1 sort the indices before looping over them. The value of this
1 element provides control over how the indices are sorted
1 before the loop traversal starts (⇒Controlling Scanning).
1
1 - 'PROCINFO["strftime"]', which holds the default format for
1 'strftime()' (⇒Time Functions).
1
1 * The special files '/dev/pid', '/dev/ppid', '/dev/pgrpid' and
1 '/dev/user' were removed.
1
1 * Support for IPv6 was added via the '/inet6' special file. '/inet4'
1 forces IPv4 and '/inet' chooses the system default, which is
1 probably IPv4 (⇒TCP/IP Networking).
1
1 * The use of '\s' and '\S' escape sequences in regular expressions
1 (⇒GNU Regexp Operators).
1
1 * Interval expressions became part of default regular expressions
1 (⇒Regexp Operators).
1
1 * POSIX character classes work even with '--traditional'
1 (⇒Regexp Operators).
1
1 * 'break' and 'continue' became invalid outside a loop, even with
1 '--traditional' (⇒Break Statement, and also see
1 ⇒Continue Statement).
1
1 * 'fflush()', 'nextfile', and 'delete ARRAY' are allowed even when
1 '--posix' or '--traditional' is used, since they are all now part
1 of POSIX.
1
1 * An optional third argument to 'asort()' and 'asorti()', specifying
1 how to sort (⇒String Functions).
1
1 * The behavior of 'fflush()' changed to match BWK 'awk' and for
1 POSIX; now both 'fflush()' and 'fflush("")' flush all open output
1 redirections (⇒I/O Functions).
1
1 * The 'isarray()' function, which determines whether an item is an
1 array, to make it possible to traverse arrays of arrays
1 (⇒Type Functions).
1
1 * The 'patsplit()' function, which gives the same capability as
1 'FPAT' for splitting strings (⇒String Functions).
1
1 * An optional fourth argument to the 'split()' function, which is an
1 array to hold the values of the separators (⇒String Functions).
1
1 * Arrays of arrays (⇒Arrays of Arrays).
1
1 * The 'BEGINFILE' and 'ENDFILE' special patterns (⇒BEGINFILE/ENDFILE).
1
1 * Indirect function calls (⇒Indirect Calls).
1
1 * 'switch' / 'case' are enabled by default (⇒Switch Statement).
1
1 * Command-line option changes (⇒Options):
1
1 - The '-b' and '--characters-as-bytes' options which prevent
1 'gawk' from treating input as a multibyte string.
1
1 - The redundant '--compat', '--copyleft', and '--usage' long
1 options were removed.
1
1 - The '--gen-po' option was finally renamed to the correct
1 '--gen-pot'.
1
1 - The '--sandbox' option which disables certain features.
1
1 - All long options acquired corresponding short options, for use
1 in '#!' scripts.
1
1 * Directories named on the command line now produce a warning, not a
1 fatal error, unless '--posix' or '--traditional' is used
1 (⇒Command-line directories).
1
1 * The 'gawk' internals were rewritten, bringing the 'dgawk' debugger
1 and possibly improved performance (⇒Debugger).
1
1 * Per the GNU Coding Standards, dynamic extensions must now define a
1 global symbol indicating that they are GPL-compatible
1 (⇒Plugin License).
1
1 * In POSIX mode, string comparisons use 'strcoll()' / 'wcscoll()'
1 (⇒POSIX String Comparison).
1
1 * The option for raw sockets was removed, since it was never
1 implemented (⇒TCP/IP Networking).
1
1 * Ranges of the form '[d-h]' are treated as if they were in the C
1 locale, no matter what kind of regexp is being used, and even if
1 '--posix' (⇒Ranges and Locales).
1
1 * Support was removed for the following systems:
1
1 - Atari
1
1 - Amiga
1
1 - BeOS
1
1 - Cray
1
1 - MIPS RiscOS
1
1 - MS-DOS with the Microsoft Compiler
1
1 - MS-Windows with the Microsoft Compiler
1
1 - NeXT
1
1 - SunOS 3.x, Sun 386 (Road Runner)
1
1 - Tandem (non-POSIX)
1
1 - Prestandard VAX C compiler for VAX/VMS
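
Two of the version 4.0 additions, 'FPAT' and 'PROCINFO["sorted_in"]',
lend themselves to a brief sketch.  The wrapper names 'demo_fpat' and
'demo_sorted_in', the sample record, and the array contents are all made
up for illustration; a 'gawk' of version 4.0 or later is assumed.

```shell
# demo_fpat and demo_sorted_in are hypothetical wrapper names.
# Assumes gawk 4.0 or later is available as "gawk".
demo_fpat() {
    printf '%s\n' 'a,"b, c",d' |
    gawk 'BEGIN {
        # Describe the fields themselves: a field is either a run of
        # non-comma characters or a double-quoted string.
        FPAT = "([^,]+)|(\"[^\"]+\")"
    }
    { print NF; print $2 }'
}

demo_sorted_in() {
    gawk 'BEGIN {
        count["pear"] = 1; count["apple"] = 2; count["fig"] = 3
        # Ask for indices to be visited in ascending string order.
        PROCINFO["sorted_in"] = "@ind_str_asc"
        sep = ""
        for (name in count) {
            printf "%s%s", sep, name
            sep = " "
        }
        print ""
    }'
}
```

With 'FPAT' set this way, the comma inside the quoted field does not split
the record, so 'demo_fpat' reports three fields and prints the second one
with its quotes intact.  'demo_sorted_in' prints the indices in
alphabetical order rather than in the usual unspecified order.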
1
1 Version 4.1 of 'gawk' introduced the following features:
1
1 * Three new arrays: 'SYMTAB', 'FUNCTAB', and
1 'PROCINFO["identifiers"]' (⇒Auto-set).
1
1 * The three executables 'gawk', 'pgawk', and 'dgawk' were merged
1 into one, named just 'gawk'. As a result, the command-line options
1 changed.
1
1 * Command-line option changes (⇒Options):
1
1 - The '-D' option invokes the debugger.
1
1 - The '-i' and '--include' options load 'awk' library files.
1
1 - The '-l' and '--load' options load compiled dynamic
1 extensions.
1
1 - The '-M' and '--bignum' options enable MPFR.
1
1 - The '-o' option only does pretty-printing.
1
1 - The '-p' option is used for profiling.
1
1 - The '-R' option was removed.
1
1 * Support for high precision arithmetic with MPFR
1 (⇒Arbitrary Precision Arithmetic).
1
1 * The 'and()', 'or()' and 'xor()' functions changed to allow any
1 number of arguments, with a minimum of two (⇒Bitwise Functions).
1
1 * The dynamic extension interface was completely redone
1 (⇒Dynamic Extensions).
1
1 * Redirected 'getline' became allowed inside 'BEGINFILE' and
1 'ENDFILE' (⇒BEGINFILE/ENDFILE).
1
1 * The 'where' command was added to the debugger (⇒Execution Stack).
1
1 * Support for Ultrix was removed.
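
The change to 'and()', 'or()', and 'xor()' noted above can be sketched as
follows.  The wrapper name 'demo_bitwise' and the operand values are made
up for illustration; a 'gawk' of version 4.1 or later is assumed.

```shell
# demo_bitwise is a hypothetical wrapper name, not part of gawk.
# Assumes gawk 4.1 or later is available as "gawk".
demo_bitwise() {
    gawk 'BEGIN {
        # 14 = 1110, 13 = 1101, 12 = 1100 in binary, so the bits common
        # to all three operands are 1100, which is 12.
        # or(1, 2, 4) sets the three low bits; xor(5, 3, 1) folds the
        # operands together from left to right.
        print and(14, 13, 12), or(1, 2, 4), xor(5, 3, 1)
    }'
}
```

Before version 4.1 each of these functions took exactly two arguments, so
the same computation required nested calls.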
1
1 Version 4.2 of 'gawk' introduced the following changes:
1
1 * Changes to 'ENVIRON' are reflected into 'gawk''s environment and
1 that of programs that it runs. ⇒Auto-set.
1
1 * 'FIELDWIDTHS' was enhanced to allow skipping characters before
1 assigning a value to a field (⇒Splitting By Content).
1
1 * The 'PROCINFO["argv"]' array. ⇒Auto-set.
1
1 * The maximum number of hexadecimal digits in '\x' escapes is now
1 two. ⇒Escape Sequences.
1
1 * Strongly typed regexp constants of the form '@/.../'
1 (⇒Strong Regexp Constants).
1
1 * The bitwise functions changed, making negative arguments into a
1 fatal error (⇒Bitwise Functions).
1
1 * The 'mktime()' function now accepts an optional second argument
1 (⇒Time Functions).
1
1 * The 'typeof()' function (⇒Type Functions).
1
1 * Optimizations are enabled by default. Use '-s' / '--no-optimize'
1 to disable optimizations.
1
1 * For many years, POSIX specified that default field splitting only
1 allowed spaces and tabs to separate fields, and this was how 'gawk'
1 behaved with '--posix'. As of 2013, the standard restored
1 historical behavior, and now default field splitting with '--posix'
1 also allows newlines to separate fields.
1
1 * Nonfatal output with 'print' and 'printf'. ⇒Nonfatal.
1
1 * Retryable I/O via 'PROCINFO[INPUT-FILE, "RETRY"]' (⇒Retrying Input).
1
1 * Changes to the pretty-printer (⇒Profiling):
1
1 - The '--pretty-print' option no longer also runs the 'awk'
1 program.
1
1 - Comments in the source program are preserved and placed into
1 the output file.
1
1 - Explicit parentheses for expressions in the input are
1 preserved in the generated output.
1
1 * Improvements to the extension API (⇒Dynamic Extensions):
1
1 - The 'get_file()' function to access open redirections.
1
1 - The 'nonfatal()' function for generating nonfatal error
1 messages.
1
1 - Support for GMP and MPFR values.
1
1 - Input parsers can now override the default field parsing
1 mechanism by specifying explicit locations.
1
1 * Shell startup files are supplied with the distribution and
1 installed by 'make install' (⇒Shell Startup Files).
1
1 * The 'igawk' program and its manual page are no longer installed
1 when 'gawk' is built. ⇒Igawk Program.
1
1 * Support for MirBSD was removed.
1
1 * Support for GNU/Linux on Alpha was removed.
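
The 'typeof()' function and strongly typed regexp constants from the
version 4.2 list work together, as the following sketch suggests.  The
wrapper name 'demo_typeof' and the variable names are made up for
illustration; a 'gawk' of version 4.2 or later is assumed.

```shell
# demo_typeof is a hypothetical wrapper name, not part of gawk.
# Assumes gawk 4.2 or later is available as "gawk".
demo_typeof() {
    gawk 'BEGIN {
        re = @/ab+c/      # strongly typed regexp constant held in a variable
        arr["k"] = 1
        print typeof(re), typeof(42), typeof("text"), typeof(arr)
        # The variable can be used on the right-hand side of a match.
        print (("xabbbc" ~ re) ? "match" : "no match")
    }'
}
```

Calling 'demo_typeof' prints the type name of each value on the first line
and the result of the match on the second.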
1