gawk: Time Functions

9.1.5 Time Functions
--------------------

'awk' programs are commonly used to process log files containing
timestamp information, indicating when a particular log record was
written.  Many programs log their timestamps in the form returned by the
'time()' system call, which is the number of seconds since a particular
epoch.  On POSIX-compliant systems, it is the number of seconds since
1970-01-01 00:00:00 UTC, not counting leap seconds.(1)  All known
POSIX-compliant systems support timestamps from 0 through 2^31 - 1,
which is sufficient to represent times through 2038-01-19 03:14:07 UTC.
Many systems support a wider range of timestamps, including negative
timestamps that represent times before the epoch.

   In order to make it easier to process such log files and to produce
useful reports, 'gawk' provides the following functions for working with
timestamps.  They are 'gawk' extensions; they are not specified in the
POSIX standard.(2)  However, recent versions of 'mawk' (⇒Other
Versions) also support these functions.  Optional parameters are
enclosed in square brackets ([ ]):

'mktime(DATESPEC' [', UTC-FLAG' ]')'
     Turn DATESPEC into a timestamp in the same form as is returned by
     'systime()'.  It is similar to the function of the same name in ISO
     C.  The argument, DATESPEC, is a string of the form
     '"YYYY MM DD HH MM SS [DST]"'.  The string consists of six or seven
     numbers representing, respectively, the full year including
     century, the month from 1 to 12, the day of the month from 1 to 31,
     the hour of the day from 0 to 23, the minute from 0 to 59, the
     second from 0 to 60,(3) and an optional daylight-savings flag.

     The values of these numbers need not be within the ranges
     specified; for example, an hour of -1 means 1 hour before midnight.
     The origin-zero Gregorian calendar is assumed, with year 0
     preceding year 1 and year -1 preceding year 0.  If UTC-FLAG is
     present and is either nonzero or non-null, the time is assumed to
     be in the UTC time zone; otherwise, the time is assumed to be in
     the local time zone.  If the DST daylight-savings flag is positive,
     the time is assumed to be daylight savings time; if zero, the time
     is assumed to be standard time; and if negative (the default),
     'mktime()' attempts to determine whether daylight savings time is
     in effect for the specified time.

     If DATESPEC does not contain enough elements or if the resulting
     time is out of range, 'mktime()' returns -1.

'strftime('[FORMAT [',' TIMESTAMP [',' UTC-FLAG] ] ]')'
     Format the time specified by TIMESTAMP based on the contents of the
     FORMAT string and return the result.  It is similar to the function
     of the same name in ISO C.  If UTC-FLAG is present and is either
     nonzero or non-null, the value is formatted as UTC (Coordinated
     Universal Time, formerly GMT or Greenwich Mean Time).  Otherwise,
     the value is formatted for the local time zone.  The TIMESTAMP is
     in the same format as the value returned by the 'systime()'
     function.  If no TIMESTAMP argument is supplied, 'gawk' uses the
     current time of day as the timestamp.  Without a FORMAT argument,
     'strftime()' uses the value of 'PROCINFO["strftime"]' as the format
     string (⇒Built-in Variables).  The default string value is
     '"%a %b %e %H:%M:%S %Z %Y"'.  This format string produces output
     that is equivalent to that of the 'date' utility.  You can assign a
     new value to 'PROCINFO["strftime"]' to change the default format;
     see the following list for the various format directives.

'systime()'
     Return the current time as the number of seconds since the system
     epoch.  On POSIX systems, this is the number of seconds since
     1970-01-01 00:00:00 UTC, not counting leap seconds.  It may be a
     different number on other systems.

   The 'systime()' function allows you to compare a timestamp from a log
file with the current time of day.  In particular, it is easy to
determine how long ago a particular record was logged.  It also allows
you to produce log records using the "seconds since the epoch" format.

   The 'mktime()' function allows you to convert a textual
representation of a date and time into a timestamp.  This makes it easy
to do before/after comparisons of dates and times, particularly when
dealing with date and time data coming from an external source, such as
a log file.

   The 'strftime()' function allows you to easily turn a timestamp into
human-readable information.  It is similar in nature to the 'sprintf()'
function (⇒String Functions), in that it copies nonformat
specification characters verbatim to the returned string, while
substituting date and time values for format specifications in the
FORMAT string.

   'strftime()' is guaranteed by the 1999 ISO C standard(4) to support
the following date format specifications:

'%a'
     The locale's abbreviated weekday name.

'%A'
     The locale's full weekday name.

'%b'
     The locale's abbreviated month name.

'%B'
     The locale's full month name.

'%c'
     The locale's "appropriate" date and time representation.  (This is
     '%a %b %e %T %Y' in the '"C"' locale.)

'%C'
     The century part of the current year.  This is the year divided by
     100 and truncated to the next lower integer.

'%d'
     The day of the month as a decimal number (01-31).

'%D'
     Equivalent to specifying '%m/%d/%y'.

'%e'
     The day of the month, padded with a space if it is only one digit.

'%F'
     Equivalent to specifying '%Y-%m-%d'.  This is the ISO 8601 date
     format.

'%g'
     The year modulo 100 of the ISO 8601 week number, as a decimal
     number (00-99).  For example, January 1, 2012, is in week 52 of
     2011.  Thus, the year of its ISO 8601 week number is 2011, even
     though its year is 2012.  Similarly, December 31, 2012, is in week
     1 of 2013.  Thus, the year of its ISO week number is 2013, even
     though its year is 2012.

'%G'
     The full year of the ISO week number, as a decimal number.

'%h'
     Equivalent to '%b'.

'%H'
     The hour (24-hour clock) as a decimal number (00-23).

'%I'
     The hour (12-hour clock) as a decimal number (01-12).

'%j'
     The day of the year as a decimal number (001-366).

'%m'
     The month as a decimal number (01-12).

'%M'
     The minute as a decimal number (00-59).

'%n'
     A newline character (ASCII LF).

'%p'
     The locale's equivalent of the AM/PM designations associated with a
     12-hour clock.

'%r'
     The locale's 12-hour clock time.  (This is '%I:%M:%S %p' in the
     '"C"' locale.)

'%R'
     Equivalent to specifying '%H:%M'.

'%S'
     The second as a decimal number (00-60).

'%t'
     A TAB character.

'%T'
     Equivalent to specifying '%H:%M:%S'.

'%u'
     The weekday as a decimal number (1-7).  Monday is day one.

'%U'
     The week number of the year (with the first Sunday as the first day
     of week one) as a decimal number (00-53).

'%V'
     The week number of the year (with the first Monday as the first day
     of week one) as a decimal number (01-53).  The method for
     determining the week number is as specified by ISO 8601.  (To wit:
     if the week containing January 1 has four or more days in the new
     year, then it is week one; otherwise it is the last week [52 or 53]
     of the previous year and the next week is week one.)

'%w'
     The weekday as a decimal number (0-6).  Sunday is day zero.

'%W'
     The week number of the year (with the first Monday as the first day
     of week one) as a decimal number (00-53).

'%x'
     The locale's "appropriate" date representation.  (This is
     '%m/%d/%y' in the '"C"' locale.)

'%X'
     The locale's "appropriate" time representation.  (This is '%T' in
     the '"C"' locale.)

'%y'
     The year modulo 100 as a decimal number (00-99).

'%Y'
     The full year as a decimal number (e.g., 2015).

'%z'
     The time zone offset in a '+HHMM' format (e.g., the format
     necessary to produce RFC 822/RFC 1036 date headers).

'%Z'
     The time zone name or abbreviation; no characters if no time zone
     is determinable.

'%Ec %EC %Ex %EX %Ey %EY %Od %Oe %OH'
'%OI %Om %OM %OS %Ou %OU %OV %Ow %OW %Oy'
     "Alternative representations" for the specifications that use only
     the second letter ('%c', '%C', and so on).(5)  (These facilitate
     compliance with the POSIX 'date' utility.)

'%%'
     A literal '%'.

   If a conversion specifier is not one of those just listed, the
behavior is undefined.(6)

   For systems that are not yet fully standards-compliant, 'gawk'
supplies a copy of 'strftime()' from the GNU C Library.  It supports all
of the just-listed format specifications.  If that version is used to
compile 'gawk' (⇒Installation), then the following additional
format specifications are available:

'%k'
     The hour (24-hour clock) as a decimal number (0-23).  Single-digit
     numbers are padded with a space.

'%l'
     The hour (12-hour clock) as a decimal number (1-12).  Single-digit
     numbers are padded with a space.

'%s'
     The time as a decimal timestamp in seconds since the epoch.

   Additionally, the alternative representations are recognized but
their normal representations are used.

   The following example is an 'awk' implementation of the POSIX 'date'
utility.  Normally, the 'date' utility prints the current date and time
of day in a well-known format.  However, if you provide an argument to
it that begins with a '+', 'date' copies nonformat specifier characters
to the standard output and interprets the current time according to the
format specifiers in the string.  For example:

     $ date '+Today is %A, %B %d, %Y.'
     -| Today is Monday, September 22, 2014.

   Here is the 'gawk' version of the 'date' utility.  It has a shell
"wrapper" to handle the '-u' option, which requires that 'date' run as
if the time zone is set to UTC:

     #! /bin/sh
     #
     # date --- approximate the POSIX 'date' command

     case $1 in
     -u)  TZ=UTC0     # use UTC
          export TZ
          shift ;;
     esac

     gawk 'BEGIN  {
         format = PROCINFO["strftime"]
         exitval = 0

         if (ARGC > 2)
             exitval = 1
         else if (ARGC == 2) {
             format = ARGV[1]
             if (format ~ /^\+/)
                 format = substr(format, 2)   # remove leading +
         }
         print strftime(format)
         exit exitval
     }' "$@"

   ---------- Footnotes ----------

   (1) ⇒Glossary, especially the entries "Epoch" and "UTC."

   (2) The GNU 'date' utility can also do many of the things described
here.  Its use may be preferable for simple time-related operations in
shell scripts.

   (3) Occasionally there are minutes in a year with a leap second,
which is why the seconds can go up to 60.

   (4) Unfortunately, not every system's 'strftime()' necessarily
supports all of the conversions listed here.

   (5) If you don't understand any of this, don't worry about it; these
facilities are meant to make it easier to "internationalize" programs.
Other internationalization features are described in
⇒Internationalization.

   (6) This is because ISO C leaves the behavior of the C version of
'strftime()' undefined and 'gawk' uses the system's version of
'strftime()' if it's there.  Typically, the conversion specifier either
does not appear in the returned string or appears literally.