coreutils: comm invocation

7.4 ‘comm’: Compare two sorted files line by line
=================================================

‘comm’ compares two input files line by line and writes to standard
output the lines that are common to both and the lines that are unique
to each; a file name of ‘-’ means standard input. Synopsis:

     comm [OPTION]... FILE1 FILE2

Before ‘comm’ can be used, the input files must be sorted using the
collating sequence specified by the ‘LC_COLLATE’ locale. If an input
file ends in a non-newline character, a newline is silently appended.
The ‘sort’ command with no options always outputs a file that is
suitable input to ‘comm’.
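As an illustrative sketch of this preparation step, the following sorts
two unsorted inputs and then compares them under one explicitly chosen
locale. The file names ‘raw1’, ‘raw2’, ‘sorted1’, and ‘sorted2’ are
hypothetical, and setting ‘LC_ALL’ is an assumption made here for
reproducibility; it overrides ‘LC_COLLATE’ for both ‘sort’ and ‘comm’.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Use one collating sequence for both sort and comm (assumption: the C locale).
export LC_ALL=C

# Hypothetical unsorted inputs.
printf '%s\n' pear Apple fig > raw1
printf '%s\n' fig Apple kiwi > raw2

# sort with no options produces input suitable for comm.
sort raw1 > sorted1
sort raw2 > sorted2

# Print only the lines common to both files.
common=$(comm -12 sorted1 sorted2)
printf '%s\n' "$common"
```

Sorting under one locale and comparing under another is a likely way to
trigger the ordering diagnostics described later in this section.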

With no options, ‘comm’ produces three-column output. Column one
contains lines unique to FILE1, column two contains lines unique to
FILE2, and column three contains lines common to both files. Columns
are separated by a single TAB character.
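A minimal sketch of the default layout, using two small sorted files
whose names and contents are made up for illustration:

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs, already sorted.
printf '%s\n' apple banana cherry > file1
printf '%s\n' banana cherry date > file2

# No options: lines unique to file1 start at the left margin, lines unique
# to file2 follow one TAB, and common lines follow two TABs.
three_col=$(comm file1 file2)
printf '%s\n' "$three_col"
```

Here ‘apple’ appears with no leading TAB, ‘date’ with one, and ‘banana’
and ‘cherry’ with two each.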

The options ‘-1’, ‘-2’, and ‘-3’ suppress printing of the
corresponding columns (and separators). Also see ⇒Common options.
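The column options can be combined to select a single column. The
following sketch assumes two hypothetical sorted files, ‘left.txt’ and
‘right.txt’; because the suppressed columns take their separators with
them, the remaining column is printed without leading TABs.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs, already sorted.
printf '%s\n' apple banana cherry > left.txt
printf '%s\n' banana cherry date > right.txt

only_left=$(comm -23 left.txt right.txt)   # keep column one only
only_right=$(comm -13 left.txt right.txt)  # keep column two only
in_both=$(comm -12 left.txt right.txt)     # keep column three only

printf 'only in left.txt: %s\n' "$only_left"
printf 'only in right.txt: %s\n' "$only_right"
printf 'in both:\n%s\n' "$in_both"
```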

Unlike some other comparison utilities, ‘comm’ has an exit status
that does not depend on the result of the comparison. Upon normal
completion ‘comm’ produces an exit code of zero. If there is an error
it exits with nonzero status.
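Because the exit status says nothing about whether the files differ, a
script has to inspect the output instead. One possible approach,
sketched here with hypothetical files ‘old.txt’ and ‘new.txt’, is to
test whether ‘comm -3’ prints anything:

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sorted inputs that differ in their second line.
printf '%s\n' a b > old.txt
printf '%s\n' a c > new.txt

# The exit status is zero even though the files differ.
comm -3 old.txt new.txt > /dev/null
status=$?

# -3 drops the common lines, so any remaining output means a difference.
differences=$(comm -3 old.txt new.txt)
if [ -n "$differences" ]; then
    verdict=differ
else
    verdict=identical
fi
printf 'exit status: %s, files %s\n' "$status" "$verdict"
```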

If the ‘--check-order’ option is given, unsorted inputs will cause a
fatal error message. If the option ‘--nocheck-order’ is given, unsorted
inputs will never cause an error message. If neither of these options
is given, wrongly sorted inputs are diagnosed only if an input file is
found to contain unpairable lines. If an input file is diagnosed as
being unsorted, the ‘comm’ command will exit with a nonzero status (and
the output should not be used).

Forcing ‘comm’ to process wrongly sorted input files containing
unpairable lines by specifying ‘--nocheck-order’ is not guaranteed to
produce any particular output. The output will probably not correspond
with whatever you hoped it would be.

‘--check-order’
     Fail with an error message if either input file is wrongly ordered.

‘--nocheck-order’
     Do not check that both input files are in sorted order.
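The two options can be contrasted with a deliberately unsorted input.
This is only a sketch: the file names are hypothetical, and the exact
wording of the diagnostic is not shown because it may vary between
versions.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs: the first is out of order, the second is sorted.
printf '%s\n' b a > unsorted.txt
printf '%s\n' a b > sorted.txt

# --check-order: the unsorted input is a fatal error, reported on stderr.
if comm --check-order unsorted.txt sorted.txt > /dev/null 2> err.txt; then
    checked_status=0
else
    checked_status=$?
fi

# --nocheck-order: no diagnostic is issued, but the output is unreliable.
if comm --nocheck-order unsorted.txt sorted.txt > /dev/null 2>&1; then
    unchecked_status=0
else
    unchecked_status=$?
fi

printf 'check-order: %s, nocheck-order: %s\n' "$checked_status" "$unchecked_status"
```

A zero status from ‘--nocheck-order’ therefore only means that no error
was reported, not that the comparison is meaningful.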

Other options are:

‘--output-delimiter=STR’
     Print STR between adjacent output columns, rather than the default
     of a single TAB character.

     The delimiter STR may not be empty.
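For instance, a comma can stand in for the TAB. The sketch below uses
hypothetical files ‘left.txt’ and ‘right.txt’; a line in column two is
preceded by one delimiter and a line in column three by two.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs, already sorted.
printf '%s\n' apple banana cherry > left.txt
printf '%s\n' banana cherry date > right.txt

# Replace the default TAB separator with a comma.
delimited=$(comm --output-delimiter=, left.txt right.txt)
printf '%s\n' "$delimited"
```

With these inputs the common lines appear as ‘,,banana’ and ‘,,cherry’,
and the line unique to ‘right.txt’ appears as ‘,date’.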

‘--total’
     Output a summary at the end.

     Similar to the regular output, column one contains the total number
     of lines unique to FILE1, column two contains the total number of
     lines unique to FILE2, and column three contains the total number
     of lines common to both files, followed by the word ‘total’ in the
     additional column four.

     In the following example, ‘comm’ omits the regular output (‘-123’),
     thus just printing the summary:

          $ printf '%s\n' a b c d e > file1
          $ printf '%s\n' b c d e f g > file2
          $ comm --total -123 file1 file2
          1       2       4       total

     This option is a GNU extension. Portable scripts should use ‘wc’
     to get the totals, e.g. for the above example files:

          $ comm -23 file1 file2 | wc -l    # number of lines only in file1
          1
          $ comm -13 file1 file2 | wc -l    # number of lines only in file2
          2
          $ comm -12 file1 file2 | wc -l    # number of lines common to both files
          4

‘-z’
‘--zero-terminated’
     Delimit items with a zero byte rather than a newline (ASCII LF).
     I.e., treat input as items separated by ASCII NUL and terminate
     output items with ASCII NUL. This option can be useful in
     conjunction with ‘perl -0’ or ‘find -print0’ and ‘xargs -0’ which
     do the same in order to reliably handle arbitrary file names (even
     those containing blanks or other special characters).
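One way this might be used is to find the file names present in two
directory trees. The sketch below is illustrative only: the directory
and file names are hypothetical, it relies on the GNU ‘sort -z’ option
to keep the NUL-terminated lists sorted, and it fixes ‘LC_ALL’ so that
‘sort’ and ‘comm’ agree on the ordering.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
export LC_ALL=C

# Hypothetical trees; one shared file name contains a blank.
mkdir dir_a dir_b
touch "dir_a/plain.txt" "dir_a/with space.txt"
touch "dir_b/other.txt" "dir_b/with space.txt"

# Build NUL-terminated, sorted lists of relative file names.
(cd dir_a && find . -type f -print0 | sort -z) > a.list
(cd dir_b && find . -type f -print0 | sort -z) > b.list

# Names present in both trees; NULs become newlines for display only.
shared=$(comm -z -12 a.list b.list | tr '\0' '\n')
printf '%s\n' "$shared"
```

The final ‘tr’ is for display and would be ambiguous for names that
contain newlines; a pipeline that acts on the names would more safely
pass the NUL-terminated output to ‘xargs -0’.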