gawk: Group Functions

1 
1 10.6 Reading the Group Database
1 ===============================
1 
1 Much of the discussion presented in ⇒Passwd Functions applies to
1 the group database as well.  Although there has traditionally been a
1 well-known file ('/etc/group') in a well-known format, the POSIX
1 standard only provides a set of C library routines ('<grp.h>' and
1 'getgrent()') for accessing the information.  Even though this file may
1 exist, it may not have complete information.  Therefore, as with the
1 user database, it is necessary to have a small C program that generates
1 the group database as its output.  'grcat', a C program that "cats" the
1 group database, is as follows:
1 
1      /*
1       * grcat.c
1       *
1       * Generate a printable version of the group database.
1       */
1      #include <stdio.h>
1      #include <grp.h>
1 
1      int
1      main(int argc, char **argv)
1      {
1          struct group *g;
1          int i;
1 
1          while ((g = getgrent()) != NULL) {
1              printf("%s:%s:%ld:", g->gr_name, g->gr_passwd,
1                                           (long) g->gr_gid);
1              for (i = 0; g->gr_mem[i] != NULL; i++) {
1                  printf("%s", g->gr_mem[i]);
1                  if (g->gr_mem[i+1] != NULL)
1                      putchar(',');
1              }
1              putchar('\n');
1          }
1          endgrent();
1          return 0;
1      }
1 
1    Each line in the group database represents one group.  The fields are
1 separated with colons and represent the following information:
1 
1 Group Name
1      The group's name.
1 
1 Group Password
1      The group's encrypted password.  In practice, this field is never
1      used; it is usually empty or set to '*'.
1 
1 Group ID Number
1      The group's numeric group ID number; the association of name to
1      number must be unique within the file.  (On some systems it's a C
1      'long', and not an 'int'.  Thus, we cast it to 'long' for all
1      cases.)
1 
1 Group Member List
1      A comma-separated list of usernames.  These users are members of
1      the group.  Modern Unix systems allow users to be members of
1      several groups simultaneously.  If your system does, then 'PROCINFO'
1      contains elements '"group1"' through '"groupN"' holding the ID
1      numbers of those groups.  (Note that 'PROCINFO' is a 'gawk'
1      extension; ⇒Built-in Variables.)
1 
1    Here is what running 'grcat' might produce:
1 
1      $ grcat
1      -| wheel:*:0:arnold
1      -| nogroup:*:65534:
1      -| daemon:*:1:
1      -| kmem:*:2:
1      -| staff:*:10:arnold,miriam,andy
1      -| other:*:20:
1      ...
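
   As a rough illustration of the record layout, the following sketch
feeds three of the sample records above to a short 'awk' program.  It
splits each record on colons and then splits the fourth field on commas.
The shell variable names and the output format are chosen here purely
for illustration; nothing in it is part of 'group.awk'.

```shell
# Sample records in the format that grcat prints (copied from the output above).
sample='wheel:*:0:arnold
nogroup:*:65534:
staff:*:10:arnold,miriam,andy'

# Split each record on colons, then split the member list on commas.
summary=$(printf '%s\n' "$sample" | awk -F: '
{
    n = split($4, members, ",")    # n is 0 when the member list is empty
    printf "%s gid=%s members=%d\n", $1, $3, n
}')

printf '%s\n' "$summary"
```

   An empty member list, as in the 'nogroup' record, yields a count of
zero because 'split()' returns zero for an empty string.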
1 
1    Here are the functions for obtaining information from the group
1 database.  There are several, modeled after the C library functions of
1 the same names:
1 
1      # group.awk --- functions for dealing with the group file
1 
1      BEGIN {
1          # Change to suit your system
1          _gr_awklib = "/usr/local/libexec/awk/"
1      }
1 
1      function _gr_init(    oldfs, oldrs, olddol0, grcat,
1                                   using_fw, using_fpat, n, a, i)
1      {
1          if (_gr_inited)
1              return
1 
1          oldfs = FS
1          oldrs = RS
1          olddol0 = $0
1          using_fw = (PROCINFO["FS"] == "FIELDWIDTHS")
1          using_fpat = (PROCINFO["FS"] == "FPAT")
1          FS = ":"
1          RS = "\n"
1 
1          grcat = _gr_awklib "grcat"
1          while ((grcat | getline) > 0) {
1              if ($1 in _gr_byname)
1                  _gr_byname[$1] = _gr_byname[$1] "," $4
1              else
1                  _gr_byname[$1] = $0
1              if ($3 in _gr_bygid)
1                  _gr_bygid[$3] = _gr_bygid[$3] "," $4
1              else
1                  _gr_bygid[$3] = $0
1 
1              n = split($4, a, "[ \t]*,[ \t]*")
1              for (i = 1; i <= n; i++)
1                  if (a[i] in _gr_groupsbyuser)
1                      _gr_groupsbyuser[a[i]] = _gr_groupsbyuser[a[i]] " " $1
1                  else
1                      _gr_groupsbyuser[a[i]] = $1
1 
1              _gr_bycount[++_gr_count] = $0
1          }
1          close(grcat)
1          _gr_count = 0
1          _gr_inited++
1          FS = oldfs
1          if (using_fw)
1              FIELDWIDTHS = FIELDWIDTHS
1          else if (using_fpat)
1              FPAT = FPAT
1          RS = oldrs
1          $0 = olddol0
1      }
1 
1    The 'BEGIN' rule sets a private variable to the directory where
1 'grcat' is stored.  Because it is used to help out an 'awk' library
1 routine, we have chosen to put it in '/usr/local/libexec/awk'.  You
1 might want it to be in a different directory on your system.
1 
1    These routines follow the same general outline as the user database
1 routines (⇒Passwd Functions).  The '_gr_inited' variable is used
1 to ensure that the database is scanned no more than once.  The
1 '_gr_init()' function first saves 'FS', 'RS', and '$0', and then sets
1 'FS' and 'RS' to the correct values for scanning the group information.
1 It also takes care to note whether 'FIELDWIDTHS' or 'FPAT' is being
1 used, and to restore the appropriate field-splitting mechanism.
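
   The save-and-restore pattern can be tried in isolation.  The sketch
below is a reduced version under several assumptions:
'read_first_group()' is a hypothetical helper, an 'echo' command stands
in for 'grcat', and the 'FIELDWIDTHS' and 'FPAT' handling is left out so
that the program does not depend on 'gawk' extensions.

```shell
result=$(printf 'alpha beta gamma\n' | awk '
# read_first_group is a hypothetical helper, not part of group.awk.
# It reads one colon-separated record from a command, then puts
# FS, RS, and $0 back the way the caller had them.
function read_first_group(    oldfs, oldrs, olddol0, cmd, name)
{
    oldfs = FS
    oldrs = RS
    olddol0 = $0
    FS = ":"
    RS = "\n"

    cmd = "echo \"staff:*:10:arnold\""    # stand-in for grcat
    if ((cmd | getline) > 0)
        name = $1
    close(cmd)

    FS = oldfs
    RS = oldrs
    $0 = olddol0        # re-splits the saved record using the restored FS
    return name
}

{
    g = read_first_group()
    print g, $2, NF     # the caller still sees its own record and fields
}')

printf '%s\n' "$result"
```

   Assigning the saved record back to '$0' after restoring 'FS' is what
makes the caller's fields reappear; restoring 'FS' alone would leave the
fields of the last record read from the command.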
1 
1    The group information is stored in several associative arrays.  The
1 arrays are indexed by group name ('_gr_byname'), by group ID number
1 ('_gr_bygid'), and by position in the database ('_gr_bycount').  There
1 is an additional array indexed by username ('_gr_groupsbyuser'), which
1 is a space-separated list of groups to which each user belongs.
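
   The indexing scheme can be sketched on a few sample records.  The
array names 'by_name', 'by_gid', and 'by_user' are shortened stand-ins
for the private arrays described above, and the records are sample data
rather than the output of a real 'grcat' run.

```shell
records='wheel:*:0:arnold
staff:*:10:arnold,miriam,andy
other:*:20:'

lookup=$(printf '%s\n' "$records" | awk -F: '
{
    by_name[$1] = $0              # whole record, indexed by group name
    by_gid[$3] = $0               # whole record, indexed by group ID

    # Space-separated list of group names, indexed by username.
    n = split($4, a, ",")
    for (i = 1; i <= n; i++)
        if (a[i] in by_user)
            by_user[a[i]] = by_user[a[i]] " " $1
        else
            by_user[a[i]] = $1
}
END {
    split(by_gid[10], f, ":")
    print "gid 10 is " f[1]
    print "arnold is in: " by_user["arnold"]
    print by_name["wheel"]
}')

printf '%s\n' "$lookup"
```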
1 
1    Unlike in the user database, it is possible to have multiple records
1 in the database for the same group.  This is common when a group has a
1 large number of members.  A pair of such entries might look like the
1 following:
1 
1      tvpeople:*:101:johnny,jay,arsenio
1      tvpeople:*:101:david,conan,tom,joan
1 
1    For this reason, '_gr_init()' checks whether a group name or group
1 ID number has already been seen.  If so, the usernames are simply
1 concatenated onto the previous list of users.(1)
1 
1    Finally, '_gr_init()' closes the pipeline to 'grcat', restores 'FS'
1 (and 'FIELDWIDTHS' or 'FPAT', if necessary), 'RS', and '$0', initializes
1 '_gr_count' to zero (it is used later), and makes '_gr_inited' nonzero.
1 
1    The 'getgrnam()' function takes a group name as its argument and, if
1 that group exists, returns its record.  Otherwise, it relies on the
1 array reference to a nonexistent element to create the element with the
1 null string as its value:
1 
1      function getgrnam(group)
1      {
1          _gr_init()
1          return _gr_byname[group]
1      }
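
   The effect of referencing a nonexistent element can be observed
directly.  In this small sketch the array 'known' and the group name
'nosuch' are invented for the demonstration:

```shell
created=$(awk '
BEGIN {
    known["staff"] = "staff:*:10:arnold"

    before = ("nosuch" in known)     # 0: nothing stored under this name yet
    value = known["nosuch"]          # the reference itself creates the element
    after = ("nosuch" in known)      # 1: it now exists, with a null value

    printf "before=%d after=%d value=[%s]\n", before, after, value
}')

printf '%s\n' "$created"
```

   A caller can therefore compare the return value against the null
string to find out whether the group exists, at the cost of one empty
array element per failed lookup.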
1 
1    The 'getgrgid()' function is similar; it takes a numeric group ID and
1 looks up the information associated with that group ID:
1 
1      function getgrgid(gid)
1      {
1          _gr_init()
1          return _gr_bygid[gid]
1      }
1 
1    The 'getgruser()' function does not have a C counterpart.  It takes a
1 username and returns the list of groups that have the user as a member:
1 
1      function getgruser(user)
1      {
1          _gr_init()
1          return _gr_groupsbyuser[user]
1      }
1 
1    The 'getgrent()' function steps through the database one entry at a
1 time.  It uses '_gr_count' to track its position in the list:
1 
1      function getgrent()
1      {
1          _gr_init()
1          if (++_gr_count in _gr_bycount)
1              return _gr_bycount[_gr_count]
1          return ""
1      }
1 
1    The 'endgrent()' function resets '_gr_count' to zero so that
1 'getgrent()' can start over again:
1 
1      function endgrent()
1      {
1          _gr_count = 0
1      }
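
   The counter-based iteration and its reset can be sketched without
the group database.  Here 'next_entry()' and 'rewind()' are stand-ins
that mirror 'getgrent()' and 'endgrent()', and the three records are
sample data assigned by hand:

```shell
walk=$(awk '
# next_entry and rewind are stand-ins that mirror getgrent and endgrent;
# the three records are made-up sample data, not read from grcat.
function next_entry()
{
    if (++pos in entries)
        return entries[pos]
    return ""
}

function rewind()
{
    pos = 0
}

BEGIN {
    entries[1] = "wheel:*:0:arnold"
    entries[2] = "daemon:*:1:"
    entries[3] = "staff:*:10:arnold,miriam,andy"

    # First pass: count the entries until the null string comes back.
    while ((e = next_entry()) != "")
        count++

    # After the reset, iteration starts from the first entry again.
    rewind()
    split(next_entry(), f, ":")
    print count, f[1]
}')

printf '%s\n' "$walk"
```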
1 
1    As with the user database routines, each function calls '_gr_init()'
1 to initialize the arrays.  Doing so incurs the extra overhead of running
1 'grcat' only if these functions are used (as opposed to moving the body
1 of '_gr_init()' into a 'BEGIN' rule).
1 
1    Most of the work is in scanning the database and building the various
1 associative arrays.  The functions that the user calls are themselves
1 very simple, relying on 'awk''s associative arrays to do the work.
1 
1    The 'id' program in ⇒Id Program uses these functions.
1 
1    ---------- Footnotes ----------
1 
1    (1) There is a subtle problem with the code just presented.  Suppose
1 that the first record for a group lists no members.  The code then
1 appends the later names with a leading comma.  It also doesn't check
1 that there is a '$4'.
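
   One possible way around the problem is sketched below.  It is not
the code used in 'group.awk'.  It assumes that every record ends its
first three fields with a colon, as the output of 'grcat' does, and the
array name 'by_name' is a stand-in for '_gr_byname'.

```shell
merged=$(printf '%s\n' 'tvpeople:*:101:' 'tvpeople:*:101:david,conan' 'tvpeople:*:101:tom' | awk -F: '
{
    if (!($1 in by_name))
        by_name[$1] = $0                  # first record for this group
    else if ($4 != "") {                  # skip records with no members
        # Add a comma only when the stored list already has a member,
        # that is, when the stored record does not end in a colon.
        sep = (by_name[$1] ~ /:$/) ? "" : ","
        by_name[$1] = by_name[$1] sep $4
    }
}
END {
    print by_name["tvpeople"]
}')

printf '%s\n' "$merged"
```

   The same test would have to be repeated for the array indexed by
group ID number.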
1