sed: Back-references and Subexpressions
1
1 5.7 Back-references and Subexpressions
1 ======================================
1
1 "Back-references" are regular expression commands which refer to a
1 previous part of the matched regular expression. Back-references are
1 specified with a backslash and a single digit (e.g. '\1'). The part of
1 the regular expression they refer to is called a "subexpression", and is
1 designated with parentheses: '\(...\)' in basic regular expressions, or
1 '(...)' in extended regular expressions ('-E'), as used in this section.
1
1 Back-references and subexpressions are used in two cases: in the
1 regular expression search pattern, and in the REPLACEMENT part of the
1 's' command (⇒Regular Expression Addresses and ⇒The "s" Command).
1
1 In a regular expression pattern, back-references are used to match
1 the same content as a previously matched subexpression. In the
1 following example, the subexpression is '.', which matches any single
1 character (the surrounding parentheses make it a subexpression). The
1 back-reference '\1' asks to match the same content (the same character)
1 as the subexpression.
1
1 The command below matches three-letter words that start with any
1 character, followed by the letter 'o', followed by the same character
1 as the first. The '^' and '$' anchors restrict the match to whole lines.
1
1 $ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
1 bob
1 mom
1 non
1 pop
1 sos
1 tot
1 wow
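The same pattern can be tried without a dictionary file. The following is a minimal sketch: the word list is made-up sample input, and the variable names are only for illustration. It also shows the equivalent basic regular expression form, where the parentheses are escaped and '-E' is omitted.

```shell
# Hypothetical sample input standing in for /usr/share/dict/words.
words='bob
box
mom
cot
wow'

# Extended syntax (-E): plain parentheses mark the subexpression.
matches_ere=$(printf '%s\n' "$words" | sed -E -n '/^(.)o\1$/p')

# Basic syntax (no -E): the parentheses must be escaped.
matches_bre=$(printf '%s\n' "$words" | sed -n '/^\(.\)o\1$/p')

printf '%s\n' "$matches_ere"
```

Both forms should select 'bob', 'mom' and 'wow', and reject 'box' and 'cot', whose last letter differs from the first.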
1
1 Multiple subexpressions are automatically numbered from left to
1 right, in the order of their opening parentheses. This command searches
1 for 6-letter palindromes (the first three letters are 3 subexpressions,
1 followed by 3 back-references in reverse order):
1
1 $ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
1 redder
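As a small illustration of the numbering, the sketch below runs the same palindrome pattern over a short, made-up list of 6-character strings (the list and the variable names are assumptions, not part of the manual).

```shell
# Hypothetical sample input: some palindromes, some near misses.
candidates='redder
better
hannah
letter
abccba'

# \3 \2 \1 replay groups 3, 2 and 1, i.e. the first three
# characters in reverse order.
palindromes=$(printf '%s\n' "$candidates" | sed -E -n '/^(.)(.)(.)\3\2\1$/p')

printf '%s\n' "$palindromes"
```

'better' and 'letter' are rejected because their final 'r' does not equal the character captured by group 1.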
1
1 In the 's' command, back-references can be used in the REPLACEMENT
1 part to refer back to subexpressions in the REGEXP part.
1
1 The following example uses two subexpressions in the regular
1 expression to match two space-separated words. The back-references in
1 the REPLACEMENT part print the words in a different order:
1
1 $ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
1 The name is Bond, James Bond.
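Two further sketches of back-references in the REPLACEMENT part follow. The inputs are invented for illustration. The first shows that, because '.*' is greedy, the first subexpression extends up to the last space when the input has more than two words. The second reorders the fields of a date, using '#' as the 's' delimiter so that the slashes in the replacement need no escaping.

```shell
# Greedy matching: group 1 captures "James T", group 2 captures "Bond".
greedy=$(echo "James T Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./')
printf '%s\n' "$greedy"

# Reorder year-month-day into day/month/year with three groups.
date_dmy=$(echo "2024-03-15" |
    sed -E 's#^([0-9]{4})-([0-9]{2})-([0-9]{2})$#\3/\2/\1#')
printf '%s\n' "$date_dmy"
```

The first command should print 'The name is Bond, James T Bond.' and the second '15/03/2024'.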
1
1 When used with alternation, if the group does not participate in the
1 match, a back-reference to it makes the whole match fail. For example,
1 'a(.)|b\1' will not match 'ba'.
1
1 When multiple regular expressions are given with '-e' or from a file
1 ('-f FILE'), back-references are local to each expression.
1