m4: Patsubst
1
1 11.6 Substituting text by regular expression
1 ============================================
1
1 Global substitution in a string is done by 'patsubst':
1
1 -- Builtin: patsubst (STRING, REGEXP, [REPLACEMENT])
1 Searches STRING for matches of REGEXP, and substitutes REPLACEMENT
1 for each match. The syntax for regular expressions is the same as
1 in GNU Emacs (⇒Regexp).
1
1 The parts of STRING that are not covered by any match of REGEXP are
1 copied to the expansion. Whenever a match is found, the search
1 proceeds from the end of the match, so a character from STRING will
1 never be substituted twice. If REGEXP matches a string of zero
1 length, the start position for the search is incremented, to avoid
1 infinite loops.
1
1 When a replacement is to be made, REPLACEMENT is inserted into the
1 expansion, with '\N' substituted by the text matched by the Nth
1 parenthesized sub-expression of REGEXP, for up to nine
1 sub-expressions. The escape '\&' is replaced by the text of the
1 entire regular expression matched. For all other characters, '\'
1 treats the next character literally. A warning is issued if there
1 were fewer sub-expressions than the '\N' requested, or if there is
1 a trailing '\'.
1
1 The REPLACEMENT argument can be omitted, in which case the text
1 matched by REGEXP is deleted.
1
1 The macro 'patsubst' is recognized only with parameters.
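
The replacement escapes described above can be sketched with a few short
calls. This is a minimal illustration using made-up input strings, not
part of the distributed examples; lines starting with '=>' show the
expected expansion, following the convention used elsewhere in this
section.

```m4
patsubst(`key=value', `\(\w+\)=\(\w+\)', `\2=\1')
=>value=key
patsubst(`one two', `\w+', `<\&>')
=><one> <two>
patsubst(`a/b/c', `/', `\\')
=>a\b\c
patsubst(`a-b-c', `-')
=>abc
```

The first call swaps two parenthesized sub-expressions with '\1' and
'\2', the second wraps each whole match with '\&', the third uses '\\'
to insert a literal backslash, and the last omits REPLACEMENT so that
each match is deleted.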
1
1 patsubst(`GNUs not Unix', `^', `OBS: ')
1 =>OBS: GNUs not Unix
1 patsubst(`GNUs not Unix', `\<', `OBS: ')
1 =>OBS: GNUs OBS: not OBS: Unix
1 patsubst(`GNUs not Unix', `\w*', `(\&)')
1 =>(GNUs)() (not)() (Unix)()
1 patsubst(`GNUs not Unix', `\w+', `(\&)')
1 =>(GNUs) (not) (Unix)
1 patsubst(`GNUs not Unix', `[A-Z][a-z]+')
1 =>GN not
1 patsubst(`GNUs not Unix', `not', `NOT\')
1 error->m4:stdin:6: Warning: trailing \ ignored in replacement
1 =>GNUs NOT Unix
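
The search rules stated in the description (matches do not overlap,
substituted text is not searched again, and an empty match advances the
search by one character) can be seen in the following sketch. The input
strings are illustrative assumptions chosen only to make each rule
visible.

```m4
patsubst(`aaaa', `aa', `b')
=>bb
patsubst(`aaa', `aa', `b')
=>ba
patsubst(`abc', `b', `bb')
=>abbc
patsubst(`abc', `x*', `-')
=>-a-b-c-
```

In the second call the final 'a' is left over because the search resumes
after the first match. In the last call 'x*' matches the empty string at
every position, so the replacement appears before each character and
once more at the end of STRING.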
1
1 Here is a slightly more realistic example, which capitalizes
1 individual words or whole sentences, by substituting calls of the macros
1 'upcase' and 'downcase' into the strings.
1
1 -- Composite: upcase (TEXT)
1 -- Composite: downcase (TEXT)
1 -- Composite: capitalize (TEXT)
1 Expand to TEXT, but with capitalization changed: 'upcase' changes
1 all letters to upper case, 'downcase' changes all letters to lower
1 case, and 'capitalize' changes the first character of each word to
1 upper case and the remaining characters to lower case.
1
1 First, an example of their usage, using implementations distributed
1 in 'm4-1.4.18/examples/capitalize.m4'.
1
1 $ m4 -I examples
1 include(`capitalize.m4')
1 =>
1 upcase(`GNUs not Unix')
1 =>GNUS NOT UNIX
1 downcase(`GNUs not Unix')
1 =>gnus not unix
1 capitalize(`GNUs not Unix')
1 =>Gnus Not Unix
1
1 Now for the implementation. There is a helper macro '_capitalize'
1 which puts only its first word in mixed case. Then 'capitalize' merely
1 parses out the words, and replaces them with an invocation of
1 '_capitalize'. (As presented here, the 'capitalize' macro has some
1 subtle flaws. Try to find and correct them yourself, or see
1 ⇒Answers Improved capitalize.)
1
1 $ m4 -I examples
1 undivert(`capitalize.m4')dnl
1 =>divert(`-1')
1 =># upcase(text)
1 =># downcase(text)
1 =># capitalize(text)
1 =># change case of text, simple version
1 =>define(`upcase', `translit(`$*', `a-z', `A-Z')')
1 =>define(`downcase', `translit(`$*', `A-Z', `a-z')')
1 =>define(`_capitalize',
1 => `regexp(`$1', `^\(\w\)\(\w*\)',
1 => `upcase(`\1')`'downcase(`\2')')')
1 =>define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')
1 =>divert`'dnl
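
The 'capitalize' definition relies on the expansion of 'patsubst' being
rescanned, so that macro calls written into REPLACEMENT are expanded
once per match. A smaller sketch of the same technique follows; the
helper macro 'bracket' is hypothetical and is not part of
'capitalize.m4'.

```m4
define(`bracket', `[$1]')dnl
patsubst(`GNUs not Unix', `\w+', `bracket(`\&')')
=>[GNUs] [not] [Unix]
```

Here 'patsubst' first produces one quoted call of 'bracket' per word,
and m4 then expands those calls when it rescans the result.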
1
1 While 'regexp' replaces the whole input with the replacement as soon
1 as there is a match, 'patsubst' replaces each _occurrence_ of a match
1 and preserves non-matching pieces:
1
1 define(`patreg',
1 `patsubst($@)
1 regexp($@)')dnl
1 patreg(`bar foo baz Foo', `foo\|Foo', `FOO')
1 =>bar FOO baz FOO
1 =>FOO
1 patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
1 =>bab abb 212
1 =>bab
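
The two builtins also differ when REGEXP does not match at all. The
following sketch, with an arbitrary input string, assumes the behavior
of 'regexp' with a REPLACEMENT argument as documented for that builtin:
'patsubst' copies STRING unchanged, while 'regexp' expands to the empty
string.

```m4
patsubst(`hello', `xyz', `!')
=>hello
regexp(`hello', `xyz', `!')
=>
```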
1
1 Omitting REGEXP triggers a warning, but STRING is still output
1 unchanged; contrast this with an empty REGEXP argument.
1
1 patsubst(`abc')
1 error->m4:stdin:1: Warning: too few arguments to builtin `patsubst'
1 =>abc
1 patsubst(`abc', `')
1 =>abc
1 patsubst(`abc', `', `\\-')
1 =>\-a\-b\-c\-
1