m4: Improved foreach

1 
1 17.3 Solution for 'foreach'
1 ===========================
1 
1 The 'foreach' and 'foreachq' macros (⇒Foreach) as presented
1 earlier each have flaws.  First, we will examine and fix the quadratic
1 behavior of 'foreachq':
1 
1      $ m4 -I examples
1      include(`foreachq.m4')
1      =>
1      traceon(`shift')debugmode(`aq')
1      =>
1      foreachq(`x', ``1', `2', `3', `4'', `x
1      ')dnl
1      =>1
1      error->m4trace: -3- shift(`1', `2', `3', `4')
1      error->m4trace: -2- shift(`1', `2', `3', `4')
1      =>2
1      error->m4trace: -4- shift(`1', `2', `3', `4')
1      error->m4trace: -3- shift(`2', `3', `4')
1      error->m4trace: -3- shift(`1', `2', `3', `4')
1      error->m4trace: -2- shift(`2', `3', `4')
1      =>3
1      error->m4trace: -5- shift(`1', `2', `3', `4')
1      error->m4trace: -4- shift(`2', `3', `4')
1      error->m4trace: -3- shift(`3', `4')
1      error->m4trace: -4- shift(`1', `2', `3', `4')
1      error->m4trace: -3- shift(`2', `3', `4')
1      error->m4trace: -2- shift(`3', `4')
1      =>4
1      error->m4trace: -6- shift(`1', `2', `3', `4')
1      error->m4trace: -5- shift(`2', `3', `4')
1      error->m4trace: -4- shift(`3', `4')
1      error->m4trace: -3- shift(`4')
1 
1    Each successive iteration was adding more quoted 'shift' invocations,
1 and the entire list contents were passing through every iteration.  In
1 general, when recursing, it is a good idea to make the recursion use
1 fewer arguments, rather than adding additional quoted uses of 'shift'.
1 By doing so, 'm4' uses less memory, invokes fewer macros, is less likely
1 to run into machine limits, and most importantly, performs faster.  The
1 fixed version of 'foreachq' can be found in
1 'm4-1.4.18/examples/foreachq2.m4':
1 
1      $ m4 -I examples
1      include(`foreachq2.m4')
1      =>
1      undivert(`foreachq2.m4')dnl
1      =>include(`quote.m4')dnl
1      =>divert(`-1')
1      =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
1      =>#   quoted list, improved version
1      =>define(`foreachq', `pushdef(`$1')_$0($@)popdef(`$1')')
1      =>define(`_arg1q', ``$1'')
1      =>define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@))')')
1      =>define(`_foreachq', `ifelse(`$2', `', `',
1      =>  `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
1      =>divert`'dnl
1      traceon(`shift')debugmode(`aq')
1      =>
1      foreachq(`x', ``1', `2', `3', `4'', `x
1      ')dnl
1      =>1
1      error->m4trace: -3- shift(`1', `2', `3', `4')
1      =>2
1      error->m4trace: -3- shift(`2', `3', `4')
1      =>3
1      error->m4trace: -3- shift(`3', `4')
1      =>4
1 
1    Note that the fixed version calls unquoted helper macros in
1 '_foreachq' to trim elements immediately; those helper macros in turn
1 must re-supply the layer of quotes lost in the macro invocation.
1 Contrast the use of '_arg1q', which quotes the first list element, with
1 '_arg1' of the earlier implementation that returned the first list
1 element directly.  Additionally, by calling the helper method
1 immediately, the 'defn(`ITERATOR')' no longer contains unexpanded
1 macros.
1 
1    The astute m4 programmer might notice that the solution above still
1 uses more memory and macro invocations, and thus more time, than
1 strictly necessary.  Note that '$2', which contains an arbitrarily long
1 quoted list, is expanded and rescanned three times per iteration of
1 '_foreachq'.  Furthermore, every iteration of the algorithm effectively
1 unboxes then reboxes the list, which costs a couple of macro
1 invocations.  It is possible to rewrite the algorithm for a bit more
1 speed by swapping the order of the arguments to '_foreachq' in order to
1 operate on an unboxed list in the first place, and by using the
1 fixed-length '$#' instead of an arbitrary length list as the key to end
1 recursion.  The result is an overhead of six macro invocations per loop
1 (excluding any macros in TEXT), instead of eight.  This alternative
1 approach is available as 'm4-1.4.18/examples/foreach3.m4':
1 
1      $ m4 -I examples
1      include(`foreachq3.m4')
1      =>
1      undivert(`foreachq3.m4')dnl
1      =>divert(`-1')
1      =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
1      =>#   quoted list, alternate improved version
1      =>define(`foreachq', `ifelse(`$2', `', `',
1      =>  `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
1      =>define(`_foreachq', `ifelse(`$#', `3', `',
1      =>  `define(`$1', `$4')$2`'$0(`$1', `$2',
1      =>    shift(shift(shift($@))))')')
1      =>divert`'dnl
1      traceon(`shift')debugmode(`aq')
1      =>
1      foreachq(`x', ``1', `2', `3', `4'', `x
1      ')dnl
1      =>1
1      error->m4trace: -4- shift(`x', `x
1      error->', `', `1', `2', `3', `4')
1      error->m4trace: -3- shift(`x
1      error->', `', `1', `2', `3', `4')
1      error->m4trace: -2- shift(`', `1', `2', `3', `4')
1      =>2
1      error->m4trace: -4- shift(`x', `x
1      error->', `1', `2', `3', `4')
1      error->m4trace: -3- shift(`x
1      error->', `1', `2', `3', `4')
1      error->m4trace: -2- shift(`1', `2', `3', `4')
1      =>3
1      error->m4trace: -4- shift(`x', `x
1      error->', `2', `3', `4')
1      error->m4trace: -3- shift(`x
1      error->', `2', `3', `4')
1      error->m4trace: -2- shift(`2', `3', `4')
1      =>4
1      error->m4trace: -4- shift(`x', `x
1      error->', `3', `4')
1      error->m4trace: -3- shift(`x
1      error->', `3', `4')
1      error->m4trace: -2- shift(`3', `4')
1 
1    In the current version of M4, every instance of '$@' is rescanned as
1 it is encountered.  Thus, the 'foreachq3.m4' alternative uses much less
1 memory than 'foreachq2.m4', and executes as much as 10% faster, since
1 each iteration encounters fewer '$@'.  However, the implementation of
1 rescanning every byte in '$@' is quadratic in the number of bytes
1 scanned (for example, making the broken version in 'foreachq.m4' cubic,
1 rather than quadratic, in behavior).  A future release of M4 will
1 improve the underlying implementation by reusing results of previous
1 scans, so that both styles of 'foreachq' can become linear in the number
1 of bytes scanned.  Notice how the implementation injects an empty
1 argument prior to expanding '$2' within 'foreachq'; the helper macro
1 '_foreachq' then ignores the third argument altogether, and ends
1 recursion when there are three arguments left because there was nothing
1 left to pass through 'shift'.  Thus, each iteration only needs one
1 'ifelse', rather than the two conditionals used in the version from
1 'foreachq2.m4'.
1 
1    So far, all of the implementations of 'foreachq' presented have been
1 quadratic with M4 1.4.x.  But 'forloop' is linear, because each
1 iteration parses a constant amount of arguments.  So, it is possible to
1 design a variant that uses 'forloop' to do the iteration, then uses '$@'
1 only once at the end, giving a linear result even with older M4
1 implementations.  This implementation relies on the GNU extension that
1 '$10' expands to the tenth argument rather than the first argument
1 concatenated with '0'.  The trick is to define an intermediate macro
1 that repeats the text 'm4_define(`$1', `$N')$2`'', with 'n' set to
1 successive integers corresponding to each argument.  The helper macro
1 '_foreachq_' is needed in order to generate the literal sequences such
1 as '$1' into the intermediate macro, rather than expanding them as the
1 arguments of '_foreachq'.  With this approach, no 'shift' calls are even
1 needed!  Even though there are seven macros of overhead per iteration
1 instead of six in 'foreachq3.m4', the linear scaling is apparent at
1 relatively small list sizes.  However, this approach will need
1 adjustment when a future version of M4 follows POSIX by no longer
1 treating '$10' as the tenth argument; the anticipation is that '${10}'
1 can be used instead, although that alternative syntax is not yet
1 supported.
1 
1      $ m4 -I examples
1      include(`foreachq4.m4')
1      =>
1      undivert(`foreachq4.m4')dnl
1      =>include(`forloop2.m4')dnl
1      =>divert(`-1')
1      =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
1      =>#   quoted list, version based on forloop
1      =>define(`foreachq',
1      =>`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
1      =>define(`_foreachq',
1      =>`pushdef(`$1', forloop(`$1', `3', `$#',
1      =>  `$0_(`1', `2', indir(`$1'))')`popdef(
1      =>    `$1')')indir(`$1', $@)')
1      =>define(`_foreachq_',
1      =>``define(`$$1', `$$3')$$2`''')
1      =>divert`'dnl
1      traceon(`shift')debugmode(`aq')
1      =>
1      foreachq(`x', ``1', `2', `3', `4'', `x
1      ')dnl
1      =>1
1      =>2
1      =>3
1      =>4
1 
1    For yet another approach, the improved version of 'foreach',
1 available in 'm4-1.4.18/examples/foreach2.m4', simply overquotes the
1 arguments to '_foreach' to begin with, using 'dquote_elt'.  Then
1 '_foreach' can just use '_arg1' to remove the extra layer of quoting
1 that was added up front:
1 
1      $ m4 -I examples
1      include(`foreach2.m4')
1      =>
1      undivert(`foreach2.m4')dnl
1      =>include(`quote.m4')dnl
1      =>divert(`-1')
1      =># foreach(x, (item_1, item_2, ..., item_n), stmt)
1      =>#   parenthesized list, improved version
1      =>define(`foreach', `pushdef(`$1')_$0(`$1',
1      =>  (dquote(dquote_elt$2)), `$3')popdef(`$1')')
1      =>define(`_arg1', `$1')
1      =>define(`_foreach', `ifelse(`$2', `(`')', `',
1      =>  `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
1      =>divert`'dnl
1      traceon(`shift')debugmode(`aq')
1      =>
1      foreach(`x', `(`1', `2', `3', `4')', `x
1      ')dnl
1      error->m4trace: -4- shift(`1', `2', `3', `4')
1      error->m4trace: -4- shift(`2', `3', `4')
1      error->m4trace: -4- shift(`3', `4')
1      =>1
1      error->m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
1      =>2
1      error->m4trace: -3- shift(``2'', ``3'', ``4'')
1      =>3
1      error->m4trace: -3- shift(``3'', ``4'')
1      =>4
1      error->m4trace: -3- shift(``4'')
1 
1    It is likewise possible to write a variant of 'foreach' that performs
1 in linear time on M4 1.4.x; the easiest method is probably writing a
1 version of 'foreach' that unboxes its list, then invokes '_foreachq' as
1 previously defined in 'foreachq4.m4'.
1 
1    In summary, recursion over list elements is trickier than it appeared
1 at first glance, but provides a powerful idiom within 'm4' processing.
1 As a final demonstration, both list styles are now able to handle
1 several scenarios that would wreak havoc on one or both of the original
1 implementations.  This points out one other difference between the list
1 styles.  'foreach' evaluates unquoted list elements only once, in
1 preparation for calling '_foreach', similary for 'foreachq' as provided
1 by 'foreachq3.m4' or 'foreachq4.m4'.  But 'foreachq', as provided by
1 'foreachq2.m4', evaluates unquoted list elements twice while visiting
1 the first list element, once in '_arg1q' and once in '_rest'.  When
1 deciding which list style to use, one must take into account whether
1 repeating the side effects of unquoted list elements will have any
1 detrimental effects.
1 
1      $ m4 -I examples
1      include(`foreach2.m4')
1      =>
1      include(`foreachq2.m4')
1      =>
1      dnl 0-element list:
1      foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
1      => / 
1      dnl 1-element list of empty element
1      foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
1      =><> / <>
1      dnl 2-element list of empty elements
1      foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
1      =><><> / <><>
1      dnl 1-element list of a comma
1      foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
1      =><,> / <,>
1      dnl 2-element list of unbalanced parentheses
1      foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
1      =><(><)> / <(><)>
1      define(`ab', `oops')dnl using defn(`iterator')
1      foreach(`x', `(`a', `b')', `defn(`x')') /dnl
1       foreachq(`x', ``a', `b'', `defn(`x')')
1      =>ab / ab
1      define(`active', `ACT, IVE')
1      =>
1      traceon(`active')
1      =>
1      dnl list of unquoted macros; expansion occurs before recursion
1      foreach(`x', `(active, active)', `<x>
1      ')dnl
1      error->m4trace: -4- active -> `ACT, IVE'
1      error->m4trace: -4- active -> `ACT, IVE'
1      =><ACT>
1      =><IVE>
1      =><ACT>
1      =><IVE>
1      foreachq(`x', `active, active', `<x>
1      ')dnl
1      error->m4trace: -3- active -> `ACT, IVE'
1      error->m4trace: -3- active -> `ACT, IVE'
1      =><ACT>
1      error->m4trace: -3- active -> `ACT, IVE'
1      error->m4trace: -3- active -> `ACT, IVE'
1      =><IVE>
1      =><ACT>
1      =><IVE>
1      dnl list of quoted macros; expansion occurs during recursion
1      foreach(`x', `(`active', `active')', `<x>
1      ')dnl
1      error->m4trace: -1- active -> `ACT, IVE'
1      =><ACT, IVE>
1      error->m4trace: -1- active -> `ACT, IVE'
1      =><ACT, IVE>
1      foreachq(`x', ``active', `active'', `<x>
1      ')dnl
1      error->m4trace: -1- active -> `ACT, IVE'
1      =><ACT, IVE>
1      error->m4trace: -1- active -> `ACT, IVE'
1      =><ACT, IVE>
1      dnl list of double-quoted macro names; no expansion
1      foreach(`x', `(``active'', ``active'')', `<x>
1      ')dnl
1      =><active>
1      =><active>
1      foreachq(`x', ```active'', ``active''', `<x>
1      ')dnl
1      =><active>
1      =><active>
1