gawk: Feature History

A.6 History of 'gawk' Features
==============================

This minor node describes the features in 'gawk' over and above those in
POSIX 'awk', in the order they were added to 'gawk'.

   Version 2.10 of 'gawk' introduced the following features:

   * The 'AWKPATH' environment variable for specifying a path search for
     the '-f' command-line option (⇒Options).

   * The 'IGNORECASE' variable and its effects (⇒
     Case-sensitivity).

   * The '/dev/stdin', '/dev/stdout', '/dev/stderr' and '/dev/fd/N'
     special file names (⇒Special Files).

   Version 2.13 of 'gawk' introduced the following features:

   * The 'FIELDWIDTHS' variable and its effects (⇒Constant Size).

   * The 'systime()' and 'strftime()' built-in functions for obtaining
     and printing timestamps (⇒Time Functions).

   * Additional command-line options (⇒Options):

        - The '-W lint' option to provide error and portability checking
          for both the source code and at runtime.

        - The '-W compat' option to turn off the GNU extensions.

        - The '-W posix' option for full POSIX compliance.

   Version 2.14 of 'gawk' introduced the following feature:

   * The 'next file' statement for skipping to the next data file (⇒
     Nextfile Statement).

   Version 2.15 of 'gawk' introduced the following features:

   * New variables (⇒Built-in Variables):

        - 'ARGIND', which tracks the movement of 'FILENAME' through
          'ARGV'.

        - 'ERRNO', which contains the system error message when
          'getline' returns -1 or 'close()' fails.

   * The '/dev/pid', '/dev/ppid', '/dev/pgrpid', and '/dev/user' special
     file names.  These have since been removed.

   * The ability to delete all of an array at once with 'delete ARRAY'
     (⇒Delete).

   * Command-line option changes (⇒Options):

        - The ability to use GNU-style long-named options that start
          with '--'.

        - The '--source' option for mixing command-line and library-file
          source code.

   Version 3.0 of 'gawk' introduced the following features:

   * New or changed variables:

        - 'IGNORECASE' changed, now applying to string comparison as
          well as regexp operations (⇒Case-sensitivity).

        - 'RT', which contains the input text that matched 'RS' (⇒
          Records).

   * Full support for both POSIX and GNU regexps (⇒Regexp).

   * The 'gensub()' function for more powerful text manipulation (⇒
     String Functions).

   * The 'strftime()' function acquired a default time format, allowing
     it to be called with no arguments (⇒Time Functions).

   * The ability for 'FS' and for the third argument to 'split()' to be
     null strings (⇒Single Character Fields).

   * The ability for 'RS' to be a regexp (⇒Records).

   * The 'next file' statement became 'nextfile' (⇒Nextfile
     Statement).

   * The 'fflush()' function from BWK 'awk' (then at Bell Laboratories;
     ⇒I/O Functions).

   * New command-line options:

        - The '--lint-old' option to warn about constructs that are not
          available in the original Version 7 Unix version of 'awk'
          (⇒V7/SVR3.1).

        - The '-m' option from BWK 'awk'.  (Brian was still at Bell
          Laboratories at the time.)  This was later removed from both
          his 'awk' and from 'gawk'.

        - The '--re-interval' option to provide interval expressions in
          regexps (⇒Regexp Operators).

        - The '--traditional' option was added as a better name for
          '--compat' (⇒Options).

   * The use of GNU Autoconf to control the configuration process (⇒
     Quick Installation).

   * Amiga support.  This has since been removed.

   Version 3.1 of 'gawk' introduced the following features:

   * New variables (⇒Built-in Variables):

        - 'BINMODE', for non-POSIX systems, which allows binary I/O for
          input and/or output files (⇒PC Using).

        - 'LINT', which dynamically controls lint warnings.

        - 'PROCINFO', an array for providing process-related
          information.

        - 'TEXTDOMAIN', for setting an application's
          internationalization text domain (⇒
          Internationalization).

   * The ability to use octal and hexadecimal constants in 'awk' program
     source code (⇒Nondecimal-numbers).

   * The '|&' operator for two-way I/O to a coprocess (⇒Two-way
     I/O).

   * The '/inet' special files for TCP/IP networking using '|&' (⇒
     TCP/IP Networking).

   * The optional second argument to 'close()' that allows closing one
     end of a two-way pipe to a coprocess (⇒Two-way I/O).

   * The optional third argument to the 'match()' function for capturing
     text-matching subexpressions within a regexp (⇒String
     Functions).

   * Positional specifiers in 'printf' formats for making translations
     easier (⇒Printf Ordering).

   * A number of new built-in functions:

        - The 'asort()' and 'asorti()' functions for sorting arrays
          (⇒Array Sorting).

        - The 'bindtextdomain()', 'dcgettext()' and 'dcngettext()'
          functions for internationalization (⇒Programmer i18n).

        - The 'extension()' function and the ability to add new built-in
          functions dynamically (⇒Dynamic Extensions).

        - The 'mktime()' function for creating timestamps (⇒Time
          Functions).

        - The 'and()', 'or()', 'xor()', 'compl()', 'lshift()',
          'rshift()', and 'strtonum()' functions (⇒Bitwise
          Functions).

   * The support for 'next file' as two words was removed completely
     (⇒Nextfile Statement).

   * Additional command-line options (⇒Options):

        - The '--dump-variables' option to print a list of all global
          variables.

        - The '--exec' option, for use in CGI scripts.

        - The '--gen-po' command-line option and the use of a leading
          underscore to mark strings that should be translated (⇒
          String Extraction).

        - The '--non-decimal-data' option to allow non-decimal input
          data (⇒Nondecimal Data).

        - The '--profile' option and 'pgawk', the profiling version of
          'gawk', for producing execution profiles of 'awk' programs
          (⇒Profiling).

        - The '--use-lc-numeric' option to force 'gawk' to use the
          locale's decimal point for parsing input data (⇒
          Conversion).

   * The use of GNU Automake to help in standardizing the configuration
     process (⇒Quick Installation).

   * The use of GNU 'gettext' for 'gawk''s own message output (⇒
     Gawk I18N).

   * BeOS support.  This was later removed.

   * Tandem support.  This was later removed.

   * The Atari port became officially unsupported and was later removed
     entirely.

   * The source code changed to use ISO C standard-style function
     definitions.

   * POSIX compliance for 'sub()' and 'gsub()' (⇒Gory Details).

   * The 'length()' function was extended to accept an array argument
     and return the number of elements in the array (⇒String
     Functions).

   * The 'strftime()' function acquired a third argument to enable
     printing times as UTC (⇒Time Functions).

   Version 4.0 of 'gawk' introduced the following features:

   * Variable additions:

        - 'FPAT', which allows you to specify a regexp that matches the
          fields, instead of matching the field separator (⇒
          Splitting By Content).

        - If 'PROCINFO["sorted_in"]' exists, 'for(iggy in foo)' loops
          sort the indices before looping over them.  The value of this
          element provides control over how the indices are sorted
          before the loop traversal starts (⇒Controlling
          Scanning).

        - 'PROCINFO["strftime"]', which holds the default format for
          'strftime()' (⇒Time Functions).

   * The special files '/dev/pid', '/dev/ppid', '/dev/pgrpid' and
     '/dev/user' were removed.

   * Support for IPv6 was added via the '/inet6' special file.  '/inet4'
     forces IPv4 and '/inet' chooses the system default, which is
     probably IPv4 (⇒TCP/IP Networking).

   * The use of '\s' and '\S' escape sequences in regular expressions
     (⇒GNU Regexp Operators).

   * Interval expressions became part of default regular expressions
     (⇒Regexp Operators).

   * POSIX character classes work even with '--traditional' (⇒
     Regexp Operators).

   * 'break' and 'continue' became invalid outside a loop, even with
     '--traditional' (⇒Break Statement, and also see ⇒
     Continue Statement).

   * 'fflush()', 'nextfile', and 'delete ARRAY' are allowed even with
     '--posix' or '--traditional', since they are all now part of POSIX.

   * An optional third argument to 'asort()' and 'asorti()', specifying
     how to sort (⇒String Functions).

   * The behavior of 'fflush()' changed to match BWK 'awk' and for
     POSIX; now both 'fflush()' and 'fflush("")' flush all open output
     redirections (⇒I/O Functions).

   * The 'isarray()' function, which determines whether an item is an
     array, to make it possible to traverse arrays of arrays (⇒
     Type Functions).

   * The 'patsplit()' function, which gives the same capability as
     'FPAT', for splitting (⇒String Functions).

   * An optional fourth argument to the 'split()' function, which is an
     array to hold the values of the separators (⇒String
     Functions).

   * Arrays of arrays (⇒Arrays of Arrays).

   * The 'BEGINFILE' and 'ENDFILE' special patterns (⇒
     BEGINFILE/ENDFILE).

   * Indirect function calls (⇒Indirect Calls).

   * 'switch' / 'case' are enabled by default (⇒Switch
     Statement).

   * Command-line option changes (⇒Options):

        - The '-b' and '--characters-as-bytes' options which prevent
          'gawk' from treating input as a multibyte string.

        - The redundant '--compat', '--copyleft', and '--usage' long
          options were removed.

        - The '--gen-po' option was finally renamed to the correct
          '--gen-pot'.

        - The '--sandbox' option which disables certain features.

        - All long options acquired corresponding short options, for use
          in '#!' scripts.

   * Directories named on the command line now produce a warning, not a
     fatal error, unless '--posix' or '--traditional' are used (⇒
     Command-line directories).

   * The 'gawk' internals were rewritten, bringing the 'dgawk' debugger
     and possibly improved performance (⇒Debugger).

   * Per the GNU Coding Standards, dynamic extensions must now define a
     global symbol indicating that they are GPL-compatible (⇒Plugin
     License).

   * In POSIX mode, string comparisons use 'strcoll()' / 'wcscoll()'
     (⇒POSIX String Comparison).

   * The option for raw sockets was removed, since it was never
     implemented (⇒TCP/IP Networking).

   * Ranges of the form '[d-h]' are treated as if they were in the C
     locale, no matter what kind of regexp is being used, and even if
     '--posix' (⇒Ranges and Locales).

   * Support was removed for the following systems:

        - Atari

        - Amiga

        - BeOS

        - Cray

        - MIPS RiscOS

        - MS-DOS with the Microsoft Compiler

        - MS-Windows with the Microsoft Compiler

        - NeXT

        - SunOS 3.x, Sun 386 (Road Runner)

        - Tandem (non-POSIX)

        - Prestandard VAX C compiler for VAX/VMS

   Version 4.1 of 'gawk' introduced the following features:

   * Three new arrays: 'SYMTAB', 'FUNCTAB', and
     'PROCINFO["identifiers"]' (⇒Auto-set).

   * The three executables 'gawk', 'pgawk', and 'dgawk' were merged
     into one, named just 'gawk'.  As a result the command-line options
     changed.

   * Command-line option changes (⇒Options):

        - The '-D' option invokes the debugger.

        - The '-i' and '--include' options load 'awk' library files.

        - The '-l' and '--load' options load compiled dynamic
          extensions.

        - The '-M' and '--bignum' options enable MPFR.

        - The '-o' option only does pretty-printing.

        - The '-p' option is used for profiling.

        - The '-R' option was removed.

   * Support for high precision arithmetic with MPFR (⇒Arbitrary
     Precision Arithmetic).

   * The 'and()', 'or()' and 'xor()' functions changed to allow any
     number of arguments, with a minimum of two (⇒Bitwise
     Functions).

   * The dynamic extension interface was completely redone (⇒
     Dynamic Extensions).

   * Redirected 'getline' became allowed inside 'BEGINFILE' and
     'ENDFILE' (⇒BEGINFILE/ENDFILE).

   * The 'where' command was added to the debugger (⇒Execution
     Stack).

   * Support for Ultrix was removed.

   Version 4.2 of 'gawk' introduced the following changes:

   * Changes to 'ENVIRON' are reflected into 'gawk''s environment and
     that of programs that it runs.  ⇒Auto-set.

   * 'FIELDWIDTHS' was enhanced to allow skipping characters before
     assigning a value to a field (⇒Splitting By Content).

   * The 'PROCINFO["argv"]' array.  ⇒Auto-set.

   * The maximum number of hexadecimal digits in '\x' escapes is now
     two.  ⇒Escape Sequences.

   * Strongly typed regexp constants of the form '@/.../' (⇒Strong
     Regexp Constants).

   * The bitwise functions changed, making negative arguments into a
     fatal error (⇒Bitwise Functions).

   * The 'mktime()' function now accepts an optional second argument
     (⇒Time Functions).

   * The 'typeof()' function (⇒Type Functions).

   * Optimizations are enabled by default.  Use '-s' / '--no-optimize'
     to disable optimizations.

   * For many years, POSIX specified that default field splitting only
     allowed spaces and tabs to separate fields, and this was how 'gawk'
     behaved with '--posix'.  As of 2013, the standard restored
     historical behavior, and now default field splitting with '--posix'
     also allows newlines to separate fields.

   * Nonfatal output with 'print' and 'printf'.  ⇒Nonfatal.

   * Retryable I/O via 'PROCINFO[INPUT-FILE, "RETRY"]' (⇒Retrying
     Input).

   * Changes to the pretty-printer (⇒Profiling):

        - The '--pretty-print' option no longer also runs the 'awk'
          program.

        - Comments in the source program are preserved and placed into
          the output file.

        - Explicit parentheses for expressions in the input are
          preserved in the generated output.

   * Improvements to the extension API (⇒Dynamic Extensions):

        - The 'get_file()' function to access open redirections.

        - The 'nonfatal()' function for generating nonfatal error
          messages.

        - Support for GMP and MPFR values.

        - Input parsers can now override the default field parsing
          mechanism by specifying explicit locations.

   * Shell startup files are supplied with the distribution and
     installed by 'make install' (⇒Shell Startup Files).

   * The 'igawk' program and its manual page are no longer installed
     when 'gawk' is built.  ⇒Igawk Program.

   * Support for MirBSD was removed.

   * Support for GNU/Linux on Alpha was removed.
