m4: Improved foreach
1
1 17.3 Solution for 'foreach'
1 ===========================
1
1 The 'foreach' and 'foreachq' macros (⇒Foreach) as presented
1 earlier each have flaws. First, we will examine and fix the quadratic
1 behavior of 'foreachq':
1
1 $ m4 -I examples
1 include(`foreachq.m4')
1 =>
1 traceon(`shift')debugmode(`aq')
1 =>
1 foreachq(`x', ``1', `2', `3', `4'', `x
1 ')dnl
1 =>1
1 error->m4trace: -3- shift(`1', `2', `3', `4')
1 error->m4trace: -2- shift(`1', `2', `3', `4')
1 =>2
1 error->m4trace: -4- shift(`1', `2', `3', `4')
1 error->m4trace: -3- shift(`2', `3', `4')
1 error->m4trace: -3- shift(`1', `2', `3', `4')
1 error->m4trace: -2- shift(`2', `3', `4')
1 =>3
1 error->m4trace: -5- shift(`1', `2', `3', `4')
1 error->m4trace: -4- shift(`2', `3', `4')
1 error->m4trace: -3- shift(`3', `4')
1 error->m4trace: -4- shift(`1', `2', `3', `4')
1 error->m4trace: -3- shift(`2', `3', `4')
1 error->m4trace: -2- shift(`3', `4')
1 =>4
1 error->m4trace: -6- shift(`1', `2', `3', `4')
1 error->m4trace: -5- shift(`2', `3', `4')
1 error->m4trace: -4- shift(`3', `4')
1 error->m4trace: -3- shift(`4')
1
1 Each successive iteration was adding more quoted 'shift' invocations,
1 and the entire list contents were passing through every iteration. In
1 general, when recursing, it is a good idea to make the recursion use
1 fewer arguments, rather than adding additional quoted uses of 'shift'.
1 By doing so, 'm4' uses less memory, invokes fewer macros, is less likely
1 to run into machine limits, and most importantly, performs faster. The
1 fixed version of 'foreachq' can be found in
1 'm4-1.4.18/examples/foreachq2.m4':
1
1 $ m4 -I examples
1 include(`foreachq2.m4')
1 =>
1 undivert(`foreachq2.m4')dnl
1 =>include(`quote.m4')dnl
1 =>divert(`-1')
1 =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
1 =># quoted list, improved version
1 =>define(`foreachq', `pushdef(`$1')_$0($@)popdef(`$1')')
1 =>define(`_arg1q', ``$1'')
1 =>define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@))')')
1 =>define(`_foreachq', `ifelse(`$2', `', `',
1 => `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
1 =>divert`'dnl
1 traceon(`shift')debugmode(`aq')
1 =>
1 foreachq(`x', ``1', `2', `3', `4'', `x
1 ')dnl
1 =>1
1 error->m4trace: -3- shift(`1', `2', `3', `4')
1 =>2
1 error->m4trace: -3- shift(`2', `3', `4')
1 =>3
1 error->m4trace: -3- shift(`3', `4')
1 =>4
1
1 Note that the fixed version calls unquoted helper macros in
1 '_foreachq' to trim elements immediately; those helper macros in turn
1 must re-supply the layer of quotes lost in the macro invocation.
1 Contrast the use of '_arg1q', which quotes the first list element, with
1 '_arg1' of the earlier implementation that returned the first list
1 element directly. Additionally, by calling the helper method
1 immediately, the 'defn(`ITERATOR')' no longer contains unexpanded
1 macros.
1
1 The astute m4 programmer might notice that the solution above still
1 uses more memory and macro invocations, and thus more time, than
1 strictly necessary. Note that '$2', which contains an arbitrarily long
1 quoted list, is expanded and rescanned three times per iteration of
1 '_foreachq'. Furthermore, every iteration of the algorithm effectively
1 unboxes then reboxes the list, which costs a couple of macro
1 invocations. It is possible to rewrite the algorithm for a bit more
1 speed by swapping the order of the arguments to '_foreachq' in order to
1 operate on an unboxed list in the first place, and by using the
1 fixed-length '$#' instead of an arbitrary length list as the key to end
1 recursion. The result is an overhead of six macro invocations per loop
1 (excluding any macros in TEXT), instead of eight. This alternative
1 approach is available as 'm4-1.4.18/examples/foreach3.m4':
1
1 $ m4 -I examples
1 include(`foreachq3.m4')
1 =>
1 undivert(`foreachq3.m4')dnl
1 =>divert(`-1')
1 =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
1 =># quoted list, alternate improved version
1 =>define(`foreachq', `ifelse(`$2', `', `',
1 => `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
1 =>define(`_foreachq', `ifelse(`$#', `3', `',
1 => `define(`$1', `$4')$2`'$0(`$1', `$2',
1 => shift(shift(shift($@))))')')
1 =>divert`'dnl
1 traceon(`shift')debugmode(`aq')
1 =>
1 foreachq(`x', ``1', `2', `3', `4'', `x
1 ')dnl
1 =>1
1 error->m4trace: -4- shift(`x', `x
1 error->', `', `1', `2', `3', `4')
1 error->m4trace: -3- shift(`x
1 error->', `', `1', `2', `3', `4')
1 error->m4trace: -2- shift(`', `1', `2', `3', `4')
1 =>2
1 error->m4trace: -4- shift(`x', `x
1 error->', `1', `2', `3', `4')
1 error->m4trace: -3- shift(`x
1 error->', `1', `2', `3', `4')
1 error->m4trace: -2- shift(`1', `2', `3', `4')
1 =>3
1 error->m4trace: -4- shift(`x', `x
1 error->', `2', `3', `4')
1 error->m4trace: -3- shift(`x
1 error->', `2', `3', `4')
1 error->m4trace: -2- shift(`2', `3', `4')
1 =>4
1 error->m4trace: -4- shift(`x', `x
1 error->', `3', `4')
1 error->m4trace: -3- shift(`x
1 error->', `3', `4')
1 error->m4trace: -2- shift(`3', `4')
1
1 In the current version of M4, every instance of '$@' is rescanned as
1 it is encountered. Thus, the 'foreachq3.m4' alternative uses much less
1 memory than 'foreachq2.m4', and executes as much as 10% faster, since
1 each iteration encounters fewer '$@'. However, the implementation of
1 rescanning every byte in '$@' is quadratic in the number of bytes
1 scanned (for example, making the broken version in 'foreachq.m4' cubic,
1 rather than quadratic, in behavior). A future release of M4 will
1 improve the underlying implementation by reusing results of previous
1 scans, so that both styles of 'foreachq' can become linear in the number
1 of bytes scanned. Notice how the implementation injects an empty
1 argument prior to expanding '$2' within 'foreachq'; the helper macro
1 '_foreachq' then ignores the third argument altogether, and ends
1 recursion when there are three arguments left because there was nothing
1 left to pass through 'shift'. Thus, each iteration only needs one
1 'ifelse', rather than the two conditionals used in the version from
1 'foreachq2.m4'.
1
1 So far, all of the implementations of 'foreachq' presented have been
1 quadratic with M4 1.4.x. But 'forloop' is linear, because each
1 iteration parses a constant amount of arguments. So, it is possible to
1 design a variant that uses 'forloop' to do the iteration, then uses '$@'
1 only once at the end, giving a linear result even with older M4
1 implementations. This implementation relies on the GNU extension that
1 '$10' expands to the tenth argument rather than the first argument
1 concatenated with '0'. The trick is to define an intermediate macro
1 that repeats the text 'm4_define(`$1', `$N')$2`'', with 'n' set to
1 successive integers corresponding to each argument. The helper macro
1 '_foreachq_' is needed in order to generate the literal sequences such
1 as '$1' into the intermediate macro, rather than expanding them as the
1 arguments of '_foreachq'. With this approach, no 'shift' calls are even
1 needed! Even though there are seven macros of overhead per iteration
1 instead of six in 'foreachq3.m4', the linear scaling is apparent at
1 relatively small list sizes. However, this approach will need
1 adjustment when a future version of M4 follows POSIX by no longer
1 treating '$10' as the tenth argument; the anticipation is that '${10}'
1 can be used instead, although that alternative syntax is not yet
1 supported.
1
1 $ m4 -I examples
1 include(`foreachq4.m4')
1 =>
1 undivert(`foreachq4.m4')dnl
1 =>include(`forloop2.m4')dnl
1 =>divert(`-1')
1 =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
1 =># quoted list, version based on forloop
1 =>define(`foreachq',
1 =>`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
1 =>define(`_foreachq',
1 =>`pushdef(`$1', forloop(`$1', `3', `$#',
1 => `$0_(`1', `2', indir(`$1'))')`popdef(
1 => `$1')')indir(`$1', $@)')
1 =>define(`_foreachq_',
1 =>``define(`$$1', `$$3')$$2`''')
1 =>divert`'dnl
1 traceon(`shift')debugmode(`aq')
1 =>
1 foreachq(`x', ``1', `2', `3', `4'', `x
1 ')dnl
1 =>1
1 =>2
1 =>3
1 =>4
1
1 For yet another approach, the improved version of 'foreach',
1 available in 'm4-1.4.18/examples/foreach2.m4', simply overquotes the
1 arguments to '_foreach' to begin with, using 'dquote_elt'. Then
1 '_foreach' can just use '_arg1' to remove the extra layer of quoting
1 that was added up front:
1
1 $ m4 -I examples
1 include(`foreach2.m4')
1 =>
1 undivert(`foreach2.m4')dnl
1 =>include(`quote.m4')dnl
1 =>divert(`-1')
1 =># foreach(x, (item_1, item_2, ..., item_n), stmt)
1 =># parenthesized list, improved version
1 =>define(`foreach', `pushdef(`$1')_$0(`$1',
1 => (dquote(dquote_elt$2)), `$3')popdef(`$1')')
1 =>define(`_arg1', `$1')
1 =>define(`_foreach', `ifelse(`$2', `(`')', `',
1 => `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
1 =>divert`'dnl
1 traceon(`shift')debugmode(`aq')
1 =>
1 foreach(`x', `(`1', `2', `3', `4')', `x
1 ')dnl
1 error->m4trace: -4- shift(`1', `2', `3', `4')
1 error->m4trace: -4- shift(`2', `3', `4')
1 error->m4trace: -4- shift(`3', `4')
1 =>1
1 error->m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
1 =>2
1 error->m4trace: -3- shift(``2'', ``3'', ``4'')
1 =>3
1 error->m4trace: -3- shift(``3'', ``4'')
1 =>4
1 error->m4trace: -3- shift(``4'')
1
1 It is likewise possible to write a variant of 'foreach' that performs
1 in linear time on M4 1.4.x; the easiest method is probably writing a
1 version of 'foreach' that unboxes its list, then invokes '_foreachq' as
1 previously defined in 'foreachq4.m4'.
1
1 In summary, recursion over list elements is trickier than it appeared
1 at first glance, but provides a powerful idiom within 'm4' processing.
1 As a final demonstration, both list styles are now able to handle
1 several scenarios that would wreak havoc on one or both of the original
1 implementations. This points out one other difference between the list
1 styles. 'foreach' evaluates unquoted list elements only once, in
1 preparation for calling '_foreach', similary for 'foreachq' as provided
1 by 'foreachq3.m4' or 'foreachq4.m4'. But 'foreachq', as provided by
1 'foreachq2.m4', evaluates unquoted list elements twice while visiting
1 the first list element, once in '_arg1q' and once in '_rest'. When
1 deciding which list style to use, one must take into account whether
1 repeating the side effects of unquoted list elements will have any
1 detrimental effects.
1
1 $ m4 -I examples
1 include(`foreach2.m4')
1 =>
1 include(`foreachq2.m4')
1 =>
1 dnl 0-element list:
1 foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
1 => /
1 dnl 1-element list of empty element
1 foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
1 =><> / <>
1 dnl 2-element list of empty elements
1 foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
1 =><><> / <><>
1 dnl 1-element list of a comma
1 foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
1 =><,> / <,>
1 dnl 2-element list of unbalanced parentheses
1 foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
1 =><(><)> / <(><)>
1 define(`ab', `oops')dnl using defn(`iterator')
1 foreach(`x', `(`a', `b')', `defn(`x')') /dnl
1 foreachq(`x', ``a', `b'', `defn(`x')')
1 =>ab / ab
1 define(`active', `ACT, IVE')
1 =>
1 traceon(`active')
1 =>
1 dnl list of unquoted macros; expansion occurs before recursion
1 foreach(`x', `(active, active)', `<x>
1 ')dnl
1 error->m4trace: -4- active -> `ACT, IVE'
1 error->m4trace: -4- active -> `ACT, IVE'
1 =><ACT>
1 =><IVE>
1 =><ACT>
1 =><IVE>
1 foreachq(`x', `active, active', `<x>
1 ')dnl
1 error->m4trace: -3- active -> `ACT, IVE'
1 error->m4trace: -3- active -> `ACT, IVE'
1 =><ACT>
1 error->m4trace: -3- active -> `ACT, IVE'
1 error->m4trace: -3- active -> `ACT, IVE'
1 =><IVE>
1 =><ACT>
1 =><IVE>
1 dnl list of quoted macros; expansion occurs during recursion
1 foreach(`x', `(`active', `active')', `<x>
1 ')dnl
1 error->m4trace: -1- active -> `ACT, IVE'
1 =><ACT, IVE>
1 error->m4trace: -1- active -> `ACT, IVE'
1 =><ACT, IVE>
1 foreachq(`x', ``active', `active'', `<x>
1 ')dnl
1 error->m4trace: -1- active -> `ACT, IVE'
1 =><ACT, IVE>
1 error->m4trace: -1- active -> `ACT, IVE'
1 =><ACT, IVE>
1 dnl list of double-quoted macro names; no expansion
1 foreach(`x', `(``active'', ``active'')', `<x>
1 ')dnl
1 =><active>
1 =><active>
1 foreachq(`x', ```active'', ``active''', `<x>
1 ')dnl
1 =><active>
1 =><active>
1