gawk: Group Functions

10.6 Reading the Group Database
===============================

Much of the discussion presented in ⇒Passwd Functions applies to
the group database as well. Although there has traditionally been a
well-known file ('/etc/group') in a well-known format, the POSIX
standard only provides a set of C library routines ('<grp.h>' and
'getgrent()') for accessing the information. Even though this file may
exist, it may not have complete information. Therefore, as with the
user database, it is necessary to have a small C program that generates
the group database as its output. 'grcat', a C program that "cats" the
group database, is as follows:

     /*
      * grcat.c
      *
      * Generate a printable version of the group database.
      */
     #include <stdio.h>
     #include <grp.h>

     int
     main(int argc, char **argv)
     {
         struct group *g;
         int i;

         while ((g = getgrent()) != NULL) {
             printf("%s:%s:%ld:", g->gr_name, g->gr_passwd,
                                  (long) g->gr_gid);
             for (i = 0; g->gr_mem[i] != NULL; i++) {
                 printf("%s", g->gr_mem[i]);
                 if (g->gr_mem[i+1] != NULL)
                     putchar(',');
             }
             putchar('\n');
         }
         endgrent();
         return 0;
     }

Each line in the group database represents one group. The fields are
separated with colons and represent the following information:

Group Name
     The group's name.

Group Password
     The group's encrypted password. In practice, this field is never
     used; it is usually empty or set to '*'.

Group ID Number
     The group's numeric group ID number; the association of name to
     number must be unique within the file. (On some systems it's a C
     'long', and not an 'int'. Thus, we cast it to 'long' for all
     cases.)

Group Member List
     A comma-separated list of usernames. These users are members of
     the group. Modern Unix systems allow users to be members of
     several groups simultaneously. If your system does, then there are
     elements '"group1"' through '"groupN"' in 'PROCINFO' for those
     group ID numbers. (Note that 'PROCINFO' is a 'gawk' extension;
     ⇒Built-in Variables.)
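
As a small illustration of the record layout just described, the
following sketch splits one sample line on colons and then splits the
member list on commas. The shell function name 'show_group_fields' and
the sample record are made up for this example; only standard 'awk'
features are used.

```shell
# show_group_fields is a hypothetical helper name, used only in this sketch.
show_group_fields() {
    printf '%s\n' "$1" | awk -F: '{
        # Fields 1 through 4 follow the layout described above
        printf "name=%s passwd=%s gid=%s\n", $1, $2, $3
        # The member list is comma-separated and may be empty
        n = split($4, members, ",")
        for (i = 1; i <= n; i++)
            printf "member[%d]=%s\n", i, members[i]
    }'
}

show_group_fields 'staff:*:10:arnold,miriam,andy'
```

For the sample record, this prints the name, password, and group ID on
one line, followed by one line per member. A record with an empty member
list, such as 'daemon:*:1:', produces only the first line.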

Here is what running 'grcat' might produce:

     $ grcat
     -| wheel:*:0:arnold
     -| nogroup:*:65534:
     -| daemon:*:1:
     -| kmem:*:2:
     -| staff:*:10:arnold,miriam,andy
     -| other:*:20:
     ...

Here are the functions for obtaining information from the group
database. There are several, modeled after the C library functions of
the same names:

     # group.awk --- functions for dealing with the group file

     BEGIN {
         # Change to suit your system
         _gr_awklib = "/usr/local/libexec/awk/"
     }

     function _gr_init(    oldfs, oldrs, olddol0, grcat,
                           using_fw, using_fpat, n, a, i)
     {
         if (_gr_inited)
             return

         oldfs = FS
         oldrs = RS
         olddol0 = $0
         using_fw = (PROCINFO["FS"] == "FIELDWIDTHS")
         using_fpat = (PROCINFO["FS"] == "FPAT")
         FS = ":"
         RS = "\n"

         grcat = _gr_awklib "grcat"
         while ((grcat | getline) > 0) {
             if ($1 in _gr_byname)
                 _gr_byname[$1] = _gr_byname[$1] "," $4
             else
                 _gr_byname[$1] = $0
             if ($3 in _gr_bygid)
                 _gr_bygid[$3] = _gr_bygid[$3] "," $4
             else
                 _gr_bygid[$3] = $0

             n = split($4, a, "[ \t]*,[ \t]*")
             for (i = 1; i <= n; i++)
                 if (a[i] in _gr_groupsbyuser)
                     _gr_groupsbyuser[a[i]] = _gr_groupsbyuser[a[i]] " " $1
                 else
                     _gr_groupsbyuser[a[i]] = $1

             _gr_bycount[++_gr_count] = $0
         }
         close(grcat)
         _gr_count = 0
         _gr_inited++
         FS = oldfs
         if (using_fw)
             FIELDWIDTHS = FIELDWIDTHS
         else if (using_fpat)
             FPAT = FPAT
         RS = oldrs
         $0 = olddol0
     }

The 'BEGIN' rule sets a private variable to the directory where
'grcat' is stored. Because it is used to help out an 'awk' library
routine, we have chosen to put it in '/usr/local/libexec/awk'. You
might want it to be in a different directory on your system.

These routines follow the same general outline as the user database
routines (⇒Passwd Functions). The '_gr_inited' variable is used
to ensure that the database is scanned no more than once. The
'_gr_init()' function first saves 'FS', 'RS', and '$0', and then sets
'FS' and 'RS' to the correct values for scanning the group information.
It also takes care to note whether 'FIELDWIDTHS' or 'FPAT' is being
used, and to restore the appropriate field-splitting mechanism.
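
The save-and-restore idea can be tried in isolation. The sketch below
is a reduced version that handles only 'FS' and '$0' (not 'RS',
'FIELDWIDTHS', or 'FPAT'); the names 'demo_restore_fs' and
'read_colon_line' are hypothetical and exist only for this example. The
function reads one colon-separated line from a command and then puts
the caller's record back as it was.

```shell
demo_restore_fs() {
    printf 'alpha beta gamma\n' | awk '
    # read_colon_line is a hypothetical function for this sketch only.
    # The parameters after the gap serve as local variables.
    function read_colon_line(cmd,    oldfs, olddol0, second)
    {
        oldfs = FS                  # remember the current settings
        olddol0 = $0
        FS = ":"
        if ((cmd | getline) > 0)    # this overwrites $0 and the fields
            second = $2
        close(cmd)
        FS = oldfs
        $0 = olddol0                # re-split with the restored FS
        return second
    }
    {
        got = read_colon_line("echo one:two:three")
        print got, $2, NF
    }'
}

demo_restore_fs
```

This prints 'two beta 3': the function saw the second colon-separated
field, while the main rule still sees its own record split on
whitespace.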

The group information is stored in several associative arrays. The
arrays are indexed by group name ('_gr_byname'), by group ID number
('_gr_bygid'), and by position in the database ('_gr_bycount'). There
is an additional array indexed by username ('_gr_groupsbyuser'), which
is a space-separated list of groups to which each user belongs.
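
To see what ends up in each array, here is a reduced sketch that builds
the same four kinds of index from a few sample records supplied inline
instead of from 'grcat'. The array names are shortened, the function
name 'demo_group_arrays' is hypothetical, and the handling of repeated
group records is left out to keep the example short.

```shell
demo_group_arrays() {
    awk -F: '
    {
        byname[$1] = $0                 # indexed by group name
        bygid[$3] = $0                  # indexed by group ID number
        bycount[++count] = $0           # indexed by position
        n = split($4, a, ",")
        for (i = 1; i <= n; i++)        # indexed by username
            if (a[i] in byuser)
                byuser[a[i]] = byuser[a[i]] " " $1
            else
                byuser[a[i]] = $1
    }
    END {
        print byname["staff"]
        print bygid[0]
        print bycount[2]
        print byuser["arnold"]
    }' <<'EOF'
wheel:*:0:arnold
nogroup:*:65534:
staff:*:10:arnold,miriam,andy
EOF
}

demo_group_arrays
```

The last line of output is 'wheel staff', the space-separated list of
groups that name 'arnold' as a member.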

Unlike in the user database, it is possible to have multiple records
in the database for the same group. This is common when a group has a
large number of members. A pair of such entries might look like the
following:

     tvpeople:*:101:johnny,jay,arsenio
     tvpeople:*:101:david,conan,tom,joan

For this reason, '_gr_init()' looks to see if a group name or group
ID number has already been seen. If so, the usernames are simply
concatenated onto the previous list of users.(1)
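
The effect of that concatenation on the two sample entries can be
checked with a short sketch. It uses the same test-and-append logic as
'_gr_init()', but with a shortened array name and with the records
supplied inline; the function name 'demo_merge_records' is hypothetical.

```shell
demo_merge_records() {
    awk -F: '
    {
        if ($1 in byname)
            byname[$1] = byname[$1] "," $4   # append the new member list
        else
            byname[$1] = $0                  # first record for this group
    }
    END { print byname["tvpeople"] }' <<'EOF'
tvpeople:*:101:johnny,jay,arsenio
tvpeople:*:101:david,conan,tom,joan
EOF
}

demo_merge_records
```

The two records collapse into the single line
'tvpeople:*:101:johnny,jay,arsenio,david,conan,tom,joan'.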

Finally, '_gr_init()' closes the pipeline to 'grcat', restores 'FS'
(and 'FIELDWIDTHS' or 'FPAT', if necessary), 'RS', and '$0', initializes
'_gr_count' to zero (it is used later), and makes '_gr_inited' nonzero.

The 'getgrnam()' function takes a group name as its argument. If
that group exists, its record is returned. Otherwise, the function
relies on the array reference to a nonexistent element to create the
element with the null string as its value:

     function getgrnam(group)
     {
         _gr_init()
         return _gr_byname[group]
     }
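
The behavior that 'getgrnam()' relies on for unknown groups is ordinary
'awk' array semantics, and can be observed directly. In this sketch the
array name 'byname' and the function name 'demo_missing_lookup' are
made up for illustration.

```shell
demo_missing_lookup() {
    awk 'BEGIN {
        byname["staff"] = "staff:*:10:arnold,miriam,andy"
        print ("nosuch" in byname)     # 0: not present yet
        result = byname["nosuch"]      # the reference creates the element
        print length(result)           # 0: the value is the null string
        print ("nosuch" in byname)     # 1: the element now exists
    }'
}

demo_missing_lookup
```

A caller that wants to test for a group without adding an empty entry
to the array can compare the return value against the null string once,
or use the 'in' operator before referencing the element.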

The 'getgrgid()' function is similar; it takes a numeric group ID and
looks up the information associated with that group ID:

     function getgrgid(gid)
     {
         _gr_init()
         return _gr_bygid[gid]
     }

The 'getgruser()' function does not have a C counterpart. It takes a
username and returns the list of groups that have the user as a member:

     function getgruser(user)
     {
         _gr_init()
         return _gr_groupsbyuser[user]
     }

The 'getgrent()' function steps through the database one entry at a
time. It uses '_gr_count' to track its position in the list:

     function getgrent()
     {
         _gr_init()
         if (++_gr_count in _gr_bycount)
             return _gr_bycount[_gr_count]
         return ""
     }

The 'endgrent()' function resets '_gr_count' to zero so that
'getgrent()' can start over again:

     function endgrent()
     {
         _gr_count = 0
     }
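
A typical calling pattern is a loop that runs until 'getgrent()'
returns the null string, followed by 'endgrent()' before the next scan.
The sketch below shows this with copies of the two functions that omit
the '_gr_init()' call and with '_gr_bycount' filled in by hand, so that
it runs without 'grcat'; the sample records and the name 'demo_iterate'
are assumptions made for the example.

```shell
demo_iterate() {
    awk '
    # Same logic as the library functions, minus the _gr_init() call,
    # so that the sketch can run without grcat.
    function getgrent()
    {
        if (++_gr_count in _gr_bycount)
            return _gr_bycount[_gr_count]
        return ""
    }
    function endgrent()
    {
        _gr_count = 0
    }
    BEGIN {
        # Stand-in data that _gr_init() would normally load
        _gr_bycount[1] = "wheel:*:0:arnold"
        _gr_bycount[2] = "daemon:*:1:"

        while ((g = getgrent()) != "")
            print "first pass:", g
        endgrent()                       # rewind to the beginning
        print "after reset:", getgrent()
    }'
}

demo_iterate
```

After the reset, 'getgrent()' returns the first record again.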

As with the user database routines, each function calls '_gr_init()'
to initialize the arrays. Doing so only incurs the extra overhead of
running 'grcat' if these functions are used (as opposed to moving the
body of '_gr_init()' into a 'BEGIN' rule).

Most of the work is in scanning the database and building the various
associative arrays. The functions that the user calls are themselves
very simple, relying on 'awk''s associative arrays to do the work.

The 'id' program in ⇒Id Program uses these functions.

---------- Footnotes ----------

(1) There is a subtle problem with the code just presented. Suppose
that the first record seen for a group has no member names. This code
then adds the later names with a leading comma. It also doesn't check
that there is a '$4'.
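
One possible way to avoid the leading comma is sketched here. It is not
part of 'group.awk'; the helper 'add_members' and the function name
'demo_guarded_merge' are hypothetical. The sketch assumes that a stored
record with no members ends in a colon, as the output of 'grcat' does.

```shell
demo_guarded_merge() {
    awk -F: '
    # add_members is a hypothetical helper, not part of group.awk.
    # It appends a member list to a stored record without producing
    # a leading or doubled comma.
    function add_members(rec, members)
    {
        if (members == "")
            return rec                  # nothing to add
        if (rec ~ /:$/)
            return rec members          # stored member list was empty
        return rec "," members
    }
    {
        list = $4                       # null string if the field is absent
        if ($1 in byname)
            byname[$1] = add_members(byname[$1], list)
        else
            byname[$1] = $0
    }
    END { print byname["tvpeople"] }' <<'EOF'
tvpeople:*:101:
tvpeople:*:101:johnny,jay
tvpeople:*:101:
tvpeople:*:101:david
EOF
}

demo_guarded_merge
```

With these four sample records, the result is
'tvpeople:*:101:johnny,jay,david', with no stray commas.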