automake: Multiple Outputs

27.9 Handling Tools that Produce Many Outputs
=============================================

This section describes a ‘make’ idiom that can be used when a tool
produces multiple output files.  It is not specific to Automake and can
be used in ordinary ‘Makefile’s.

   Suppose we have a program called ‘foo’ that will read one file called
‘data.foo’ and produce two files named ‘data.c’ and ‘data.h’.  We want
to write a ‘Makefile’ rule that captures this one-to-two dependency.

   The naive rule is incorrect:

     # This is incorrect.
     data.c data.h: data.foo
             foo data.foo

What the above rule really says is that ‘data.c’ and ‘data.h’ each
depend on ‘data.foo’, and can each be built by running ‘foo data.foo’.
In other words, it is equivalent to:

     # We do not want this.
     data.c: data.foo
             foo data.foo
     data.h: data.foo
             foo data.foo

which means that ‘foo’ can be run twice.  Usually it will not be run
twice, because ‘make’ implementations are smart enough to check for the
existence of the second file after the first one has been built; they
will therefore detect that it already exists.  However, there are a few
situations where it can run twice anyway:

   • The most worrying case is when running a parallel ‘make’.  If
     ‘data.c’ and ‘data.h’ are built in parallel, two ‘foo data.foo’
     commands will run concurrently.  This is harmful, because both
     commands write the same output files at the same time.
   • Another case is when the dependency (here ‘data.foo’) is (or
     depends upon) a phony target.
1 
   A solution that works with parallel ‘make’ but not with phony
dependencies is the following:

     data.c data.h: data.foo
             foo data.foo
     data.h: data.c

The above rules are equivalent to:

     data.c: data.foo
             foo data.foo
     data.h: data.foo data.c
             foo data.foo

Therefore, a parallel ‘make’ will have to serialize the builds of
‘data.c’ and ‘data.h’, and will detect that the second is no longer
needed once the first is over.

   Using this pattern is probably enough for most cases.  However, it
does not scale easily to more output files (in this scheme all output
files must be totally ordered by the dependency relation), so we will
explore a more complicated solution.

   Another idea is to write the following:

     # There is still a problem with this one.
     data.c: data.foo
             foo data.foo
     data.h: data.c

The idea is that ‘foo data.foo’ is run only when ‘data.c’ needs to be
updated, but we further state that ‘data.h’ depends upon ‘data.c’.  That
way, if ‘data.h’ is required and ‘data.c’ is out of date with respect to
‘data.foo’, the dependency on ‘data.c’ will trigger the build.

   This is almost perfect, but suppose we have built ‘data.h’ and
‘data.c’, and then we erase ‘data.h’.  Then, running ‘make data.h’ will
not rebuild ‘data.h’.  The above rules just state that ‘data.c’ must be
up-to-date with respect to ‘data.foo’, and this is already the case.

   What we need is a rule that forces a rebuild when ‘data.h’ is
missing.  Here it is:

     data.c: data.foo
             foo data.foo
     data.h: data.c
     ## Recover from the removal of $@
             @if test -f $@; then :; else \
               rm -f data.c; \
               $(MAKE) $(AM_MAKEFLAGS) data.c; \
             fi
1 
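   The recipe of the recover rule is plain shell, so its logic can be
tried outside of ‘make’.  The following is only a sketch under stated
assumptions: ‘recover’ and ‘rebuild_witness’ are hypothetical names, and
‘rebuild_witness’ stands in for the recursive ‘$(MAKE) $(AM_MAKEFLAGS)
data.c’ by recreating both files directly.

```shell
# Hypothetical stand-in for "$(MAKE) $(AM_MAKEFLAGS) data.c":
# pretend that running foo recreates both outputs.
rebuild_witness () {
  : > data.c
  : > data.h
}

# Same test as the recipe: do nothing if the target exists,
# otherwise remove the witness and rebuild it.
recover () {
  if test -f "$1"; then :; else
    rm -f data.c
    rebuild_witness
  fi
}

workdir=$(mktemp -d) && cd "$workdir" || exit 1
rebuild_witness        # initial build
rm -f data.h           # the user erases one output
recover data.h         # the recover logic brings it back
test -f data.h && echo "data.h restored"
```

Removing the witness before rebuilding matters: without ‘rm -f data.c’
the recursive ‘make’ would consider ‘data.c’ up to date and do nothing.
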
   The above scheme can be extended to handle more outputs and more
inputs.  One of the outputs is selected to serve as a witness to the
successful completion of the command: it depends upon all inputs, and
all other outputs depend upon it.  For instance, if ‘foo’ should
additionally read ‘data.bar’ and also produce ‘data.w’ and ‘data.x’, we
would write:

     data.c: data.foo data.bar
             foo data.foo data.bar
     data.h data.w data.x: data.c
     ## Recover from the removal of $@
             @if test -f $@; then :; else \
               rm -f data.c; \
               $(MAKE) $(AM_MAKEFLAGS) data.c; \
             fi

   However, there are now three minor problems in this setup.  One is
related to the timestamp ordering of ‘data.h’, ‘data.w’, ‘data.x’, and
‘data.c’.  Another one is a race condition if a parallel ‘make’ attempts
to run multiple instances of the recover block at once.  Finally, the
recursive rule breaks ‘make -n’ when run with GNU ‘make’ (as well as
some other ‘make’ implementations), as it may remove ‘data.c’ even when
it should not (⇒ How the ‘MAKE’ Variable Works: (make)MAKE Variable).

   Let us deal with the first problem.  ‘foo’ outputs four files, but we
do not know in which order these files are created.  Suppose that
‘data.h’ is created before ‘data.c’.  Then we have a weird situation.
The next time ‘make’ is run, ‘data.h’ will appear older than ‘data.c’,
the second rule will be triggered, a shell will be started to execute
the ‘if...fi’ command, but actually it will just execute the ‘then’
branch, that is: nothing.  In other words, because the witness we
selected is not the first file created by ‘foo’, ‘make’ will start a
shell to do nothing each time it is run.

   A simple riposte is to fix the timestamps when this happens.

     data.c: data.foo data.bar
             foo data.foo data.bar
     data.h data.w data.x: data.c
             @if test -f $@; then \
               touch $@; \
             else \
     ## Recover from the removal of $@
               rm -f data.c; \
               $(MAKE) $(AM_MAKEFLAGS) data.c; \
             fi
1 
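   The effect of the ‘touch’ can be observed with plain shell commands.
This is an illustrative sketch only: the dates are arbitrary, and
‘find -newer’ is used here merely as a portable way to compare
modification times.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Simulate foo creating data.h one minute before data.c
# (arbitrary dates; touch -t takes [[CC]YY]MMDDhhmm).
touch -t 202001010000 data.h
touch -t 202001010001 data.c

# "find FILE -newer REF" prints FILE only if it is newer than REF,
# which is the condition under which make reruns the second rule.
before=$(find data.c -newer data.h)

# What the "then" branch of the rule above does.
touch data.h
after=$(find data.c -newer data.h)

echo "before: '$before'  after: '$after'"
```

Before the ‘touch’, ‘data.c’ is reported as newer than ‘data.h’;
afterwards it is not, so ‘make’ has no reason to run the recipe again.
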
   Another solution is to use a different and dedicated file as witness,
rather than using any of ‘foo’’s outputs.

     data.stamp: data.foo data.bar
             @rm -f data.tmp
             @touch data.tmp
             foo data.foo data.bar
             @mv -f data.tmp $@
     data.c data.h data.w data.x: data.stamp
     ## Recover from the removal of $@
             @if test -f $@; then :; else \
               rm -f data.stamp; \
               $(MAKE) $(AM_MAKEFLAGS) data.stamp; \
             fi

   ‘data.tmp’ is created before ‘foo’ is run, so its timestamp is older
than those of the files output by ‘foo’.  It is then renamed to
‘data.stamp’ after ‘foo’ has run, because we do not want to update
‘data.stamp’ if ‘foo’ fails.
1 
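   The same create-then-rename sequence can be sketched as a shell
function.  ‘run_with_stamp’ is a hypothetical helper, not part of
Automake, and ‘true’ and ‘false’ merely stand in for a succeeding and a
failing ‘foo’.

```shell
# Hypothetical helper mirroring the data.stamp recipe:
# run_with_stamp STAMP COMMAND [ARG...]
run_with_stamp () {
  stamp=$1; shift
  rm -f "$stamp.tmp"
  touch "$stamp.tmp"            # timestamp taken before the tool runs
  "$@" || return $?             # on failure, leave the stamp untouched
  mv -f "$stamp.tmp" "$stamp"   # on success, publish the stamp
}

workdir=$(mktemp -d) && cd "$workdir" || exit 1

# "false" stands in for a failing foo.
if run_with_stamp data.stamp false; then :; else
  test -f data.stamp || echo "no stamp after failure"
fi

# "true" stands in for a successful foo.
run_with_stamp data.stamp true
test -f data.stamp && echo "stamp present after success"
```

Because the stamp only appears through the final ‘mv’, its presence
always means that the command completed successfully.
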
   This solution still suffers from the second problem: the race
condition in the recover rule.  If, after a successful build, a user
erases ‘data.c’ and ‘data.h’, and runs ‘make -j’, then ‘make’ may start
both recover rules in parallel.  If the two instances of the rule
execute ‘$(MAKE) $(AM_MAKEFLAGS) data.stamp’ concurrently, the build is
likely to fail (for instance, the two rules will create ‘data.tmp’, but
only one can rename it).

   Admittedly, such a weird situation does not arise during ordinary
builds.  It occurs only when the build tree is mutilated.  Here ‘data.c’
and ‘data.h’ have been explicitly removed without also removing
‘data.stamp’ and the other output files.  ‘make clean; make’ will always
recover from these situations even with parallel makes, so you may
decide that the recover rule is solely to help non-parallel make users
and leave things as-is.  Fixing this requires some locking mechanism to
ensure only one instance of the recover rule rebuilds ‘data.stamp’.  One
could imagine something along the following lines.

     data.c data.h data.w data.x: data.stamp
     ## Recover from the removal of $@
             @if test -f $@; then :; else \
               trap 'rm -rf data.lock data.stamp' 1 2 13 15; \
     ## mkdir is a portable test-and-set
               if mkdir data.lock 2>/dev/null; then \
     ## This code is being executed by the first process.
                 rm -f data.stamp; \
                 $(MAKE) $(AM_MAKEFLAGS) data.stamp; \
                 result=$$?; rm -rf data.lock; exit $$result; \
               else \
     ## This code is being executed by the follower processes.
     ## Wait until the first process is done.
                 while test -d data.lock; do sleep 1; done; \
     ## Succeed if and only if the first process succeeded.
                 test -f data.stamp; \
               fi; \
             fi
1 
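   The lock relies on ‘mkdir’ failing when the directory already exists.
The following sketch shows that behavior in isolation; the variable
names are illustrative only, and the three attempts run sequentially
here rather than in competing processes.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# mkdir either creates the directory or fails, so only one caller
# can become the leader while the lock directory exists.
if mkdir data.lock 2>/dev/null; then first=leader; else first=follower; fi
if mkdir data.lock 2>/dev/null; then second=leader; else second=follower; fi
echo "$first $second"

# Releasing the lock lets the next caller acquire it again.
rmdir data.lock
if mkdir data.lock 2>/dev/null; then third=leader; else third=follower; fi
echo "$third"
rmdir data.lock
```

The first attempt wins, the second is turned away, and a third attempt
succeeds once the lock has been released.
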
   Using a dedicated witness, like ‘data.stamp’, is very handy when the
list of output files is not known beforehand.  As an illustration,
consider the following rules to compile many ‘*.el’ files into ‘*.elc’
files in a single command.  It does not matter how ‘ELFILES’ is defined
(as long as it is not empty: empty targets are not accepted by POSIX).

     ELFILES = one.el two.el three.el ...
     ELCFILES = $(ELFILES:=c)

     elc-stamp: $(ELFILES)
             @rm -f elc-temp
             @touch elc-temp
             $(elisp_comp) $(ELFILES)
             @mv -f elc-temp $@

     $(ELCFILES): elc-stamp
             @if test -f $@; then :; else \
     ## Recover from the removal of $@
               trap 'rm -rf elc-lock elc-stamp' 1 2 13 15; \
               if mkdir elc-lock 2>/dev/null; then \
     ## This code is being executed by the first process.
                 rm -f elc-stamp; \
                 $(MAKE) $(AM_MAKEFLAGS) elc-stamp; \
     ## Preserve the exit status of the recursive make.
                 result=$$?; rmdir elc-lock; exit $$result; \
               else \
     ## This code is being executed by the follower processes.
     ## Wait until the first process is done.
                 while test -d elc-lock; do sleep 1; done; \
     ## Succeed if and only if the first process succeeded.
                 test -f elc-stamp; exit $$?; \
               fi; \
             fi
1 
   These solutions all still suffer from the third problem, namely that
they break the promise that ‘make -n’ should not cause any actual
changes to the tree.  For those solutions that do not create lock files,
it is possible to split the recover rules into two separate recipe
commands, one of which does all work but the recursion, and the other
invokes the recursive ‘$(MAKE)’.  The solutions involving locking could
act upon the contents of the ‘MAKEFLAGS’ variable, but parsing that
portably is not easy (⇒ (autoconf)The Make Macro MAKEFLAGS).  Here
is an example:

     ELFILES = one.el two.el three.el ...
     ELCFILES = $(ELFILES:=c)

     elc-stamp: $(ELFILES)
             @rm -f elc-temp
             @touch elc-temp
             $(elisp_comp) $(ELFILES)
             @mv -f elc-temp $@

     $(ELCFILES): elc-stamp
     ## Recover from the removal of $@
             @dry=; for f in x $$MAKEFLAGS; do \
               case $$f in \
                 *=*|--*);; \
                 *n*) dry=:;; \
               esac; \
             done; \
             if test -f $@; then :; else \
               $$dry trap 'rm -rf elc-lock elc-stamp' 1 2 13 15; \
               if $$dry mkdir elc-lock 2>/dev/null; then \
     ## This code is being executed by the first process.
                 $$dry rm -f elc-stamp; \
                 $(MAKE) $(AM_MAKEFLAGS) elc-stamp; \
     ## Preserve the exit status of the recursive make.
                 result=$$?; $$dry rmdir elc-lock; exit $$result; \
               else \
     ## This code is being executed by the follower processes.
     ## Wait until the first process is done.
                 while test -d elc-lock && test -z "$$dry"; do \
                   sleep 1; \
                 done; \
     ## Succeed if and only if the first process succeeded.
                 $$dry test -f elc-stamp; exit $$?; \
               fi; \
             fi
1 
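   The ‘MAKEFLAGS’ loop and the ‘$$dry’ prefix can be exercised in
isolation.  In this sketch ‘dry_prefix’ is a hypothetical wrapper around
the same loop, and the sample flag strings are made up for illustration;
actual ‘MAKEFLAGS’ contents vary between ‘make’ implementations.

```shell
# Hypothetical wrapper around the loop from the recipe: print ":"
# if the given MAKEFLAGS value requests a dry run, nothing otherwise.
dry_prefix () {
  dry=
  for f in x $1; do          # $1 is left unquoted to split it into words
    case $f in
      *=*|--*) ;;            # skip variable assignments and long options
      *n*) dry=: ;;          # a short-option word containing "n"
    esac
  done
  echo "$dry"
}

dry_prefix "kn"                               # prints ":"
dry_prefix "k --no-print-directory CC=clang"  # prints an empty line

workdir=$(mktemp -d) && cd "$workdir" || exit 1
dry=$(dry_prefix "n")
$dry mkdir elc-lock          # expands to ": mkdir elc-lock", a no-op
test -d elc-lock || echo "nothing created"
```

The first ‘case’ pattern is what keeps words such as ‘CC=clang’ or
‘--no-print-directory’, which happen to contain an ‘n’, from being
mistaken for the ‘-n’ option.
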
   For completeness it should be noted that GNU ‘make’ is able to
express rules with multiple output files using pattern rules (⇒ Pattern
Rule Examples: (make)Pattern Examples).  We do not discuss pattern rules
here because they are not portable, but they can be convenient in
packages that assume GNU ‘make’.