ed: Regular expressions
1
1 5 Regular expressions
1 *********************
1
1 Regular expressions are patterns used in selecting text. For example,
1 the 'ed' command
1
1 g/STRING/
1
1 prints all lines containing STRING. Regular expressions are also used
1 by the 's' command for selecting old text to be replaced with new text.
1
1 In addition to specifying string literals, regular expressions can
1 represent classes of strings. Strings thus represented are said to be
1 matched by the corresponding regular expression. If it is possible for a
1 regular expression to match several strings in a line, then the
1 left-most longest match is the one selected.
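The left-most longest rule can be tried outside of 'ed'. The following is
a minimal sketch, assuming GNU sed, whose 's' command is assumed here to
select matches the same way as the 'ed' 's' command. The sample line and
the variable names are made up for illustration.

```shell
# Hypothetical sample line containing two candidate matches for 'fo*'.
line='foo foooo'

# 'fo*' could match the longer 'foooo' later in the line, but the match
# that starts furthest to the left wins, and at that starting position
# the longest candidate ('foo', not 'f' or 'fo') is taken.
selected=$(printf '%s\n' "$line" | sed 's/fo*/<&>/')

printf '%s\n' "$selected"    # <foo> foooo
```

The '&' in the replacement stands for the matched text, so the angle
brackets show exactly which part of the line was selected.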
1
1 The following symbols are used in constructing regular expressions:
1
1 'C'
1 Any character C not listed below, including '{', '}', '(', ')',
1 '<' and '>', matches itself.
1
1 '\C'
1 Any backslash-escaped character C, other than '{', '}', '(', ')',
1 '<', '>', 'b', 'B', 'w', 'W', '+' and '?', matches itself.
1
1 '.'
1 Matches any single character.
1
1 '[CHAR-CLASS]'
1 Matches any single character in CHAR-CLASS. To include a ']' in
1 CHAR-CLASS, it must be the first character. A range of characters
1 may be specified by separating the end characters of the range
1 with a '-', e.g., 'a-z' specifies the lower case characters. The
1 following literal expressions can also be used in CHAR-CLASS to
1 specify sets of characters:
1
1 [:alnum:] [:cntrl:] [:lower:] [:space:]
1 [:alpha:] [:digit:] [:print:] [:upper:]
1 [:blank:] [:graph:] [:punct:] [:xdigit:]
1
1 If '-' appears as the first or last character of CHAR-CLASS, then
1 it matches itself. All other characters in CHAR-CLASS match
1 themselves.
1
1 Patterns in CHAR-CLASS of the form:
1 [.COL-ELM.]
1 [=COL-ELM=]
1
1 where COL-ELM is a "collating element" are interpreted according
1 to 'locale (5)'. See 'regex (3)' for an explanation of these
1 constructs.
1
1 '[^CHAR-CLASS]'
1 Matches any single character, other than newline, not in
1 CHAR-CLASS. CHAR-CLASS is defined as above.
1
1 '^'
1 If '^' is the first character of a regular expression, then it
1 anchors the regular expression to the beginning of a line.
1 Otherwise, it matches itself.
1
1 '$'
1 If '$' is the last character of a regular expression, it anchors
1 the regular expression to the end of a line. Otherwise, it matches
1 itself.
1
1 '\(RE\)'
1 Defines a (possibly null) subexpression RE. Subexpressions may be
1 nested. A subsequent backreference of the form '\N', where N is a
1 number in the range [1,9], expands to the text matched by the Nth
1 subexpression. For example, the regular expression '\(a.c\)\1'
1 matches the string 'abcabc', but not 'abcadc'. Subexpressions are
1 ordered relative to their left delimiter.
1
1 '*'
1 Matches the single character regular expression or subexpression
1 immediately preceding it zero or more times. If '*' is the first
1 character of a regular expression or subexpression, then it matches
1 itself. The '*' operator sometimes yields unexpected results. For
1 example, the regular expression 'b*' matches the beginning of the
1 string 'abbb', as opposed to the substring 'bbb', since a null
1 match is the only left-most match.
1
1 '\{N,M\}'
1 '\{N,\}'
1 '\{N\}'
1 Matches the single character regular expression or subexpression
1 immediately preceding it at least N and at most M times. If M is
1 omitted, then it matches at least N times. If the comma is also
1 omitted, then it matches exactly N times. If any of these forms
1 occurs first in a regular expression or subexpression, then it is
1 interpreted literally (i.e., the regular expression '\{2\}'
1 matches the string '{2}', and so on).
1
1 '\<'
1 '\>'
1 Anchors the single character regular expression or subexpression
1 immediately following it to the beginning (in the case of '\<') or
1 ending (in the case of '\>') of a "word", i.e., in ASCII, a
1 maximal string of alphanumeric characters, including the
1 underscore (_).
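The character class and anchor rules above can be checked with a short
sketch. It assumes GNU sed, which is assumed to share this basic syntax
with 'ed'; the sample strings and variable names are illustrative only.

```shell
# A named class must itself sit inside a bracket expression.
digits=$(printf 'a1b22\n' | sed 's/[[:digit:]]/#/g')

# ']' is literal when it is the first character of the class.
bracket=$(printf 'x]y\n' | sed 's/[]]/R/')

# '-' is literal when it is the first or last character of the class.
hyphen=$(printf 'a-b\n' | sed 's/[a-]/_/g')

# A leading '^' inside the brackets negates the class.
negated=$(printf 'abc123\n' | sed 's/[^a-z]/_/g')

# Outside brackets, '^' and '$' anchor only at the very start or end of
# the regular expression; elsewhere they match themselves.
caret=$(printf 'a^b\n' | sed 's/a^b/ok/')
dollar=$(printf 'a$b\n' | sed 's/a$b/ok/')
anchored=$(printf 'aXa\n' | sed 's/^a/B/; s/a$/E/')

printf '%s\n' "$digits" "$bracket" "$hyphen" "$negated" \
              "$caret" "$dollar" "$anchored"
```

With GNU sed the lines printed should be 'a#b##', 'xRy', '__b',
'abc___', 'ok', 'ok' and 'BXE'.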
1
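Backreferences, '*' and the interval operators described above can be
sketched the same way. This again assumes GNU sed as a stand-in for
'ed'; 'sed -n /RE/p' plays the role of the 'ed' command 'g/RE/p', and
the input strings are invented for the example.

```shell
# '\(a.c\)\1' requires the text matched by the subexpression to repeat
# exactly, so only 'abcabc' is printed.
backref=$(printf 'abcabc\nabcadc\n' | sed -n '/\(a.c\)\1/p')

# 'b*' matches the null string at the start of 'abbb'; that empty match
# is the left-most one, so the replacement lands before the 'a'.
star=$(printf 'abbb\n' | sed 's/b*/X/')

# A leading '*' has nothing to repeat and matches itself.
leading=$(printf '2*3\n' | sed 's/*/x/')

# '\{2,3\}' takes at least two and at most three repetitions.
interval=$(printf 'a aa aaa aaaa\n' | sed 's/a\{2,3\}/X/g')

printf '%s\n' "$backref" "$star" "$leading" "$interval"
```

With GNU sed the lines printed should be 'abcabc', 'Xabbb', '2x3' and
'a X X Xa'. In the last case the run of four 'a' characters yields one
three-character match and leaves a single 'a' behind.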
1
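The word anchors '\<' and '\>' can be illustrated with a small sketch,
assuming GNU sed and ASCII input. The sample line is made up; note how
the underscore counts as part of a word.

```shell
line='cat concat cat_x'

# '\<cat\>' needs a word start before 'c' and a word end after 't'.
# 'concat' fails the first test; 'cat_x' fails the second, because '_'
# is a word character.
whole=$(printf '%s\n' "$line" | sed 's/\<cat\>/DOG/g')

# '\<con' only requires that 'con' begin a word.
prefix=$(printf '%s\n' "$line" | sed 's/\<con/X/')

printf '%s\n' "$whole" "$prefix"
```

With GNU sed this should print 'DOG concat cat_x' followed by
'cat Xcat cat_x'.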
1 The following extended operators are preceded by a backslash '\' to
1 distinguish them from traditional 'ed' syntax.
1
1 '\`'
1 '\''
1 Unconditionally matches the beginning '\`' or ending '\'' of a
1 line.
1
1 '\?'
1 Optionally matches the single character regular expression or
1 subexpression immediately preceding it. For example, the regular
1 expression 'a[bd]\?c' matches the strings 'abc', 'adc' and 'ac'.
1 If '\?' occurs at the beginning of a regular expression or
1 subexpression, then it matches a literal '?'.
1
1 '\+'
1 Matches the single character regular expression or subexpression
1 immediately preceding it one or more times. So the regular
1 expression 'a\+' is shorthand for 'aa*'. If '\+' occurs at the
1 beginning of a regular expression or subexpression, then it
1 matches a literal '+'.
1
1 '\b'
1 Matches the beginning or ending (null string) of a word. Thus the
1 regular expression '\bhello\b' is equivalent to '\<hello\>'.
1 However, '\b\b' is a valid regular expression whereas '\<\>' is
1 not.
1
1 '\B'
1 Matches the null string inside a word.
1
1 '\w'
1 Matches any character in a word.
1
1 '\W'
1 Matches any character not in a word.
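The extended operators above can be exercised with a brief sketch. It
assumes GNU sed, which is assumed to accept the same backslash
extensions as 'ed'; the sample strings and variable names are
illustrative only.

```shell
# '\?' makes the preceding bracket expression optional.
optional=$(printf 'abc adc ac axc\n' | sed 's/a[bd]\?c/X/g')

# '\+' requires at least one occurrence; 'a\+' behaves like 'aa*'.
plus=$(printf 'b ab aab\n' | sed 's/a\+/X/g')
same=$(printf 'b ab aab\n' | sed 's/aa*/X/g')

# '\b' matches at either edge of a word, so 'othello' is skipped.
edge=$(printf 'othello hello\n' | sed 's/\bhello\b/HI/')

# '\B' matches only inside a word, so the standalone 'cat' is skipped.
inside=$(printf 'cat concat\n' | sed 's/\Bcat/X/')

# '\w' matches word characters (including '_'); '\W' matches the rest.
wordch=$(printf 'a_1-b\n' | sed 's/\w/w/g')
nonword=$(printf 'a_1-b\n' | sed 's/\W/./g')

printf '%s\n' "$optional" "$plus" "$same" "$edge" \
              "$inside" "$wordch" "$nonword"
```

With GNU sed the lines printed should be 'X X X axc', 'b Xb Xb',
'b Xb Xb', 'othello HI', 'cat conX', 'www-w' and 'a_1.b'.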
1
1