gawk: Strong Regexp Constants

6.1.2.2 Strongly Typed Regexp Constants
.......................................

This minor node describes a 'gawk'-specific feature.

As we saw in the previous minor node, regexp constants ('/.../') hold
a strange position in the 'awk' language. In most contexts, they act
like an expression: '$0 ~ /.../'. In other contexts, they denote only a
regexp to be matched. In no case are they really a "first class
citizen" of the language. That is, you cannot define a scalar variable
whose type is "regexp" in the same sense that you can define a variable
to be a number or a string:

     num = 42        # Numeric variable
     str = "hi"      # String variable
     re = /foo/      # Wrong! re is the result of $0 ~ /foo/

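The last assignment above can be checked directly. The following is a
minimal sketch, not taken from the manual, that sets '$0' by hand in a
'BEGIN' rule so that the result of the implicit '$0 ~ /foo/' match is
visible:

```awk
BEGIN {
    $0 = "foo bar"
    re = /foo/      # equivalent to: re = ($0 ~ /foo/)
    print re        # prints 1, because $0 contains "foo"

    $0 = "baz"
    re = /foo/
    print re        # prints 0, because $0 no longer matches
}
```

In both cases 're' holds a number, not a regexp.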
For a number of more advanced use cases, it would be nice to have
regexp constants that are "strongly typed"; in other words, that denote
a regexp useful for matching, and not an expression.

'gawk' provides this feature. A strongly typed regexp constant looks
almost like a regular regexp constant, except that it is preceded by an
'@' sign:

     re = @/foo/     # Regexp variable

Strongly typed regexp constants _cannot_ be used everywhere that a
regular regexp constant can, because this would make the language even
more confusing. Instead, you may use them only in certain contexts:

* On the righthand side of the '~' and '!~' operators: 'some_var ~
  @/foo/' (⇒Regexp Usage).

* In the 'case' part of a 'switch' statement (⇒Switch Statement).

* As an argument to one of the built-in functions that accept regexp
  constants: 'gensub()', 'gsub()', 'match()', 'patsplit()',
  'split()', and 'sub()' (⇒String Functions).

* As a parameter in a call to a user-defined function
  (⇒User-defined).

* On the righthand side of an assignment to a variable: 'some_var =
  @/foo/'. In this case, the type of 'some_var' is regexp.
  Additionally, 'some_var' can be used with '~' and '!~', passed to
  one of the built-in functions listed above, or passed as a
  parameter to a user-defined function.

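The permitted contexts listed above can be exercised in one short
program. This is an illustrative sketch only; the function name
'matches()' and the sample strings are made up for the example, and it
assumes a 'gawk' recent enough to support strongly typed regexp
constants:

```awk
# Hypothetical helper: receives a regexp as its second parameter.
function matches(text, re)
{
    return text ~ re
}

BEGIN {
    s = "food for thought"

    # Righthand side of the '~' operator
    if (s ~ @/foo/)
        print "match operator: yes"

    # 'case' part of a 'switch' statement
    switch (s) {
    case @/^foo/:
        print "switch: starts with foo"
        break
    default:
        print "switch: no match"
    }

    # Argument to a built-in function
    n = gsub(@/o/, "0", s)
    print n, s                        # prints: 4 f00d f0r th0ught

    # Parameter in a call to a user-defined function
    print matches("foobar", @/bar$/)  # prints 1

    # Righthand side of an assignment; 're' now has regexp type
    re = @/th/
    print (s ~ re)                    # prints 1
}
```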
You may use the 'typeof()' built-in function (⇒Type Functions)
to determine if a variable or function parameter is a regexp variable.

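As a brief sketch of such a check, the following program prints the
type reported by 'typeof()' for a number, a string, and a regexp
variable, and then for a function parameter. The function name 'kind()'
is hypothetical and exists only for this example:

```awk
# Hypothetical helper: reports the type of whatever it is passed.
function kind(x)
{
    return typeof(x)
}

BEGIN {
    num = 42
    str = "hi"
    re  = @/foo/

    print typeof(num)     # prints: number
    print typeof(str)     # prints: string
    print typeof(re)      # prints: regexp
    print kind(@/bar/)    # prints: regexp
}
```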
The true power of this feature comes from the ability to create
variables that have regexp type. Such variables can be passed on to
user-defined functions, without the confusing aspects of computed
regular expressions created from strings or string constants. They may
also be passed through indirect function calls (⇒Indirect Calls)
and on to the built-in functions that accept regexp constants.

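One way this might look in practice is sketched below. The function
'count_matches()' and the variable names are assumptions made for
illustration; the function receives a regexp variable and hands it on
to 'gsub()', first through a direct call and then through an indirect
call:

```awk
# Hypothetical helper: counts non-overlapping matches of 're' in 'text'.
# 'copy' is a local variable, so the caller's string is left unchanged.
function count_matches(text, re,    copy)
{
    copy = text
    return gsub(re, "&", copy)    # replace each match with itself
}

BEGIN {
    word = @/[[:alpha:]]+/        # regexp variable
    fn = "count_matches"          # function name for the indirect call

    print count_matches("one two three", word)   # prints 3
    print @fn("one two three", word)             # prints 3
}
```

Because 'word' has regexp type, no string escaping is needed when it
travels through the function calls.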
When used in numeric conversions, strongly typed regexp variables
convert to zero. When used in string conversions, they convert to the
string value of the original regexp text.

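A small sketch of both conversions, using an arbitrary regexp chosen
for the example:

```awk
BEGIN {
    re = @/fo+/

    print re + 0     # numeric conversion; prints 0
    print re ""      # string conversion; prints fo+
}
```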