tar: controlling pattern-matching

Controlling Pattern-Matching
----------------------------

For the purposes of this section, we call "exclusion members" all member
names obtained while processing '--exclude' and '--exclude-from'
options, and "inclusion members" those member names that were given on
the command line or read from the file specified with the
'--files-from' option.

These two member lists are used in the following operations:
'--diff', '--extract', '--list', '--update'.

There are no inclusion members in create mode ('--create' and
'--append'), since in this mode the names obtained from the command line
refer to _files_, not archive members.

By default, inclusion members are compared with archive members
literally (1) and exclusion members are treated as globbing patterns.
For example:

     $ tar tf foo.tar
     a.c
     b.c
     a.txt
     [remarks]
     # Member names are used verbatim:
     $ tar -xf foo.tar -v '[remarks]'
     [remarks]
     # Exclude member names are globbed:
     $ tar -xf foo.tar -v --exclude '*.c'
     a.txt
     [remarks]

This behavior can be altered by using the following options:

'--wildcards'
     Treat all member names as wildcards.

'--no-wildcards'
     Treat all member names as literal strings.

Thus, to extract files whose names end in '.c', you can use:

     $ tar -xf foo.tar -v --wildcards '*.c'
     a.c
     b.c

Note that the pattern is quoted to prevent the shell from expanding it.
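The difference between literal and wildcard inclusion members can be
reproduced with a throwaway archive. The following is a minimal sketch,
assuming GNU 'tar' is installed as 'tar'; the temporary directory and
the variable names 'literal_status' and 'globbed' are illustrative only
and do not come from the manual:

```shell
# Build a scratch archive with the member names used above.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch a.c b.c a.txt '[remarks]'
tar -cf foo.tar a.c b.c a.txt '[remarks]'

# Without --wildcards the name '*.c' is compared literally, so no
# member matches and tar reports failure.
literal_status=0
tar -tf foo.tar '*.c' >/dev/null 2>&1 || literal_status=$?

# With --wildcards the same argument is treated as a globbing pattern.
globbed=$(tar -tf foo.tar --wildcards '*.c')

echo "literal exit status: $literal_status"
echo "$globbed"

# Remove the scratch directory again.
cd / && rm -rf "$tmp"
```

Listing ('-t') is used instead of extraction so that the sketch writes
nothing besides the archive itself; both operations use the same member
lists.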

The effect of the '--wildcards' option is canceled by '--no-wildcards'.
This makes it possible to pass some command line arguments verbatim and
others as globbing patterns. For example, the following invocation:

     $ tar -xf foo.tar --wildcards '*.txt' --no-wildcards '[remarks]'

instructs 'tar' to extract from 'foo.tar' all files whose names end in
'.txt' and the file named '[remarks]'.
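To see why switching back to '--no-wildcards' matters for a name such
as '[remarks]', the sketch below adds a hypothetical member named 'r',
which the bracket expression would select if it were read as a pattern.
It assumes GNU 'tar'; the archive contents and the variable names
'mixed' and 'all_globbed' are made up for illustration:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
# 'r' is an extra member that the bracket expression [remarks] would
# match if it were treated as a pattern.
touch a.txt r '[remarks]'
tar -cf foo.tar a.txt r '[remarks]'

# '*.txt' is a pattern, '[remarks]' is taken verbatim.
mixed=$(tar -tf foo.tar --wildcards '*.txt' --no-wildcards '[remarks]')

# Both arguments are patterns, so '[remarks]' selects the member 'r'.
all_globbed=$(tar -tf foo.tar --wildcards '*.txt' '[remarks]')

printf 'mixed:\n%s\nall globbed:\n%s\n' "$mixed" "$all_globbed"
cd / && rm -rf "$tmp"
```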

Normally, a pattern matches a name if an initial subsequence of the
name's components matches the pattern, where '*', '?', and '[...]' are
the usual shell wildcards, '\' escapes wildcards, and wildcards can
match '/'.

Other than optionally stripping leading '/' from names (⇒absolute),
patterns and names are used as-is. For example, a trailing '/' is not
trimmed from a user-specified name before deciding whether to exclude
it.

However, this matching procedure can be altered by the options listed
below. These options accumulate. For example:

     --ignore-case --exclude='makefile' --no-ignore-case --exclude='readme'

ignores case when excluding 'makefile', but not when excluding 'readme'.
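The accumulation of options can be checked against a small archive.
This is only a sketch, assuming GNU 'tar'; the file names 'Makefile',
'README' and 'main.c', the archive name 'src.tar' and the variable
'kept' are hypothetical:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch Makefile README main.c
tar -cf src.tar Makefile README main.c

# --ignore-case applies to 'makefile' only; --no-ignore-case restores
# case-sensitive matching before 'readme' is read.
kept=$(tar -tf src.tar --ignore-case --exclude='makefile' \
           --no-ignore-case --exclude='readme')

# 'Makefile' is excluded, 'README' is not.
echo "$kept"
cd / && rm -rf "$tmp"
```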

'--anchored'
'--no-anchored'
     If anchored, a pattern must match an initial subsequence of the
     name's components. Otherwise, the pattern can match any
     subsequence. The default is '--no-anchored' for exclusion members
     and '--anchored' for inclusion members.

'--ignore-case'
'--no-ignore-case'
     When ignoring case, upper-case patterns match lower-case names and
     vice versa. When not ignoring case (the default), matching is
     case-sensitive.

'--wildcards-match-slash'
'--no-wildcards-match-slash'
     When wildcards match slash (the default for exclusion members), a
     wildcard like '*' in the pattern can match a '/' in the name.
     Otherwise, '/' is matched only by '/'.

The '--recursion' and '--no-recursion' options (⇒recurse) also
affect how member patterns are interpreted. If recursion is in effect,
a pattern matches a name if it matches any of the name's parent
directories.
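The interplay of anchoring and recursion for exclusion members can be
sketched as follows. The example assumes GNU 'tar' with recursion left
at its default; the directory layout, the archive name 'tree.tar' and
the variables 'unanchored' and 'anchored' are illustrative:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p src lib/src
touch src/main.c lib/src/util.c
tar -cf tree.tar src/main.c lib/src/util.c

# Default for exclusion members (--no-anchored): 'src' may match at
# any component, so both members are excluded.
unanchored=$(tar -tf tree.tar --exclude='src')

# --anchored: 'src' must match at the start of the member name, so
# only src/main.c is excluded.
anchored=$(tar -tf tree.tar --anchored --exclude='src')

echo "unanchored: [$unanchored]"
echo "anchored:   [$anchored]"
cd / && rm -rf "$tmp"
```

In both runs the pattern 'src' excludes files below a matching
directory because, with recursion in effect, matching a parent
directory is enough.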

The following table summarizes pattern-matching default values:

Members         Default settings
--------------------------------------------------------------------------
Inclusion       '--no-wildcards --anchored --no-wildcards-match-slash'
Exclusion       '--wildcards --no-anchored --wildcards-match-slash'

Wildcard matching confusion
...........................

The '--[no-]anchored' and '--[no-]wildcards-match-slash' options have
proven to cause confusion. The likely reasons are that the default
settings differ between inclusion and exclusion patterns (in general,
you should not rely on defaults if you can avoid it), and that the
position of these two options on the command line matters: they should
be placed before the member name on the command line.

Consider the following directory structure:

     $ find path/ | sort
     path/
     path/file1
     path/file2
     path/subpath1
     path/subpath1/file1
     path/subpath1/file2
     path/subpath2
     path/subpath2/file1
     path/subpath2/file2

Archiving the whole directory 'path' except for all files named 'file1'
can be achieved with either of the following two commands:

     $ tar -cf a.tar --no-wildcards-match-slash --no-anchored path \
           --exclude='*/file1'
     $ tar -cf a.tar --wildcards-match-slash path --exclude='*/file1'

Note that '--wildcards-match-slash' and '--no-anchored' may be omitted,
as they are the defaults for '--exclude'. Usually, however, we want to
exclude just one particular file (or rather, a subset of the files
sharing the same name). Assume we want to exclude only the files named
'file1' at the first directory level (directly under 'path'). The
following command does not work (it still excludes all files named
'file1'):

     $ tar -cf a.tar --no-wildcards-match-slash path \
           --exclude='*/file1' | sort

This is because '--no-anchored' is the default for exclusion patterns.
To fix this, put '--anchored' before the path name:

     $ tar -cvf a.tar --no-wildcards-match-slash --anchored path \
           --exclude='*/file1' | sort
     path/
     path/file2
     path/subpath1/
     path/subpath1/file1
     path/subpath1/file2
     path/subpath2/
     path/subpath2/file1
     path/subpath2/file2

Similarly, you can exclude files at the second level by specifying
'*/*/file1'.
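The whole scenario can be replayed in a scratch directory. The sketch
below assumes GNU 'tar'; the archive names and the variables
'all_levels' and 'first_level' are illustrative. Here '--exclude' is
placed before the directory name, which should select the same members
as the commands above and avoids complaints from releases that treat
'--exclude' as position-sensitive in create mode (an assumption of this
sketch, not something stated in this section):

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p path/subpath1 path/subpath2
touch path/file1 path/file2 \
      path/subpath1/file1 path/subpath1/file2 \
      path/subpath2/file1 path/subpath2/file2

# Defaults for --exclude: every member named file1 is skipped.
all_levels=$(tar -cvf a.tar --exclude='*/file1' path | LC_ALL=C sort)

# Anchored pattern whose '*' cannot cross '/': only path/file1 is
# skipped.
first_level=$(tar -cvf b.tar --no-wildcards-match-slash --anchored \
                  --exclude='*/file1' path | LC_ALL=C sort)

printf '%s\n---\n%s\n' "$all_levels" "$first_level"
cd / && rm -rf "$tmp"
```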

---------- Footnotes ----------

(1) Notice that earlier GNU 'tar' versions used globbing for
inclusion members, which contradicted the UNIX98 specification and was
not documented. See ⇒Changes for more information on this and other
changes.