gccint: IPA

25.3 Using summary information in IPA passes
============================================

Programs are represented internally as a _callgraph_ (a multi-graph
where nodes are functions and edges are call sites) and a _varpool_ (a
list of static and external variables in the program).
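
 As an illustration, both structures can be walked with the iteration
macros from 'cgraph.h'.  The following is a minimal, hedged sketch
rather than part of any real pass; 'dump_ipa_graph' is a hypothetical
helper, and details such as 'dump_name' should be verified against the
headers of the GCC version at hand:

     /* Sketch: walk the callgraph and the varpool.  */
     #include "config.h"
     #include "system.h"
     #include "coretypes.h"
     #include "cgraph.h"

     static void
     dump_ipa_graph (FILE *f)
     {
       cgraph_node *node;
       varpool_node *vnode;

       /* Every function in the compilation unit.  */
       FOR_EACH_FUNCTION (node)
         {
           fprintf (f, "function %s\n", node->dump_name ());
           /* Each outgoing edge corresponds to one call site.  */
           for (cgraph_edge *e = node->callees; e; e = e->next_callee)
             fprintf (f, "  calls %s\n", e->callee->dump_name ());
         }

       /* Every static or external variable.  */
       FOR_EACH_VARIABLE (vnode)
         fprintf (f, "variable %s\n", vnode->dump_name ());
     }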

 The inter-procedural optimization is organized as a sequence of
individual passes, which operate on the callgraph and the varpool.  To
make the implementation of WHOPR possible, every inter-procedural
optimization pass is split into several stages that are executed at
different times during WHOPR compilation (a skeleton pass wiring these
stages together is sketched after the list):

   * LGEN time
       1. _Generate summary_ ('generate_summary' in 'struct
          ipa_opt_pass_d').  This stage analyzes every function body
          and variable initializer and stores the relevant information
          into a pass-specific data structure.

       2. _Write summary_ ('write_summary' in 'struct ipa_opt_pass_d').
          This stage writes all the pass-specific information generated
          by 'generate_summary'.  Summaries go into their own
          'LTO_section_*' sections that have to be declared in
          'lto-streamer.h':'enum lto_section_type'.  A new section is
          created by calling 'create_output_block' and data can be
          written using the 'lto_output_*' routines.

   * WPA time
       1. _Read summary_ ('read_summary' in 'struct ipa_opt_pass_d').
          This stage reads all the pass-specific information in exactly
          the same order that it was written by 'write_summary'.

       2. _Execute_ ('execute' in 'struct opt_pass').  This performs
          inter-procedural propagation.  This must be done without
          actual access to the individual function bodies or variable
          initializers.  Typically, this results in a transitive closure
          operation over the summary information of all the nodes in the
          callgraph.

       3. _Write optimization summary_ ('write_optimization_summary' in
          'struct ipa_opt_pass_d').  This writes the result of the
          inter-procedural propagation into the object file.  This can
          use the same data structures and helper routines used in
          'write_summary'.

   * LTRANS time
       1. _Read optimization summary_ ('read_optimization_summary' in
          'struct ipa_opt_pass_d').  The counterpart to
          'write_optimization_summary'.  This reads the interprocedural
          optimization decisions in exactly the same format emitted by
          'write_optimization_summary'.

       2. _Transform_ ('function_transform' and 'variable_transform' in
          'struct ipa_opt_pass_d').  The actual function bodies and
          variable initializers are updated based on the information
          passed down from the _Execute_ stage.
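
 Putting these stages together, an inter-procedural pass is declared by
deriving from 'ipa_opt_pass_d' and wiring the hooks into its
constructor.  The skeleton below is only a hedged sketch modeled on
existing passes such as 'ipa-pure-const.c'; every 'my_*' name and
'LTO_section_my_pass' are inventions of this example, and the
'pass_data' fields, the constructor argument order and the streaming
helpers ('create_output_block', 'streamer_write_uhwi', 'produce_asm')
should be checked against 'tree-pass.h', 'lto-streamer.h' and
'data-streamer.h':

     /* Hypothetical pass "my-ipa"; all my_* names are invented.  */

     static void
     my_generate_summary (void)
     {
       /* LGEN: examine function bodies and variable initializers and
          record pass-specific data.  */
     }

     static void
     my_write_summary (void)
     {
       /* LGEN: stream the recorded data into the pass's own section
          (LTO_section_my_pass would be a new entry in
          'enum lto_section_type').  */
       struct output_block *ob = create_output_block (LTO_section_my_pass);
       streamer_write_uhwi (ob, 0 /* e.g. number of records */);
       produce_asm (ob, NULL);
       destroy_output_block (ob);
     }

     static void
     my_read_summary (void)
     {
       /* WPA: read the data back in exactly the order it was written.  */
     }

     static unsigned int
     my_function_transform (cgraph_node *node)
     {
       /* LTRANS: apply the recorded decisions to NODE's body.  */
       return 0;
     }

     const pass_data pass_data_my_ipa =
     {
       IPA_PASS,        /* type */
       "my-ipa",        /* name */
       OPTGROUP_NONE,   /* optinfo_flags */
       TV_NONE,         /* tv_id */
       0, 0, 0,         /* properties required/provided/destroyed */
       0, 0,            /* todo_flags start/finish */
     };

     class pass_my_ipa : public ipa_opt_pass_d
     {
     public:
       pass_my_ipa (gcc::context *ctxt)
         : ipa_opt_pass_d (pass_data_my_ipa, ctxt,
                           my_generate_summary,   /* generate_summary */
                           my_write_summary,      /* write_summary */
                           my_read_summary,       /* read_summary */
                           NULL,   /* write_optimization_summary */
                           NULL,   /* read_optimization_summary */
                           NULL,   /* stmt_fixup */
                           0,      /* function_transform_todo_flags_start */
                           my_function_transform, /* function_transform */
                           NULL)   /* variable_transform */
       {}

       unsigned int execute (function *) final override
       {
         /* WPA: propagate over the summaries; function bodies are not
            available at this point.  */
         return 0;
       }
     };

 A real pass would additionally provide a 'make_pass_*' entry point and
be registered in 'passes.def'; those details are omitted from the
sketch.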

 The implementation of the inter-procedural passes is shared between
LTO, WHOPR and classic non-LTO compilation.

   * During the traditional file-by-file mode every pass executes its
     own _Generate summary_, _Execute_, and _Transform_ stages within
     the single execution context of the compiler.

   * In LTO compilation mode, every pass uses the _Generate summary_ and
     _Write summary_ stages at compilation time, while the _Read
     summary_, _Execute_, and _Transform_ stages are executed at link
     time.

   * In WHOPR mode all stages are used.
 To simplify development, the GCC pass manager differentiates between
normal inter-procedural passes and small inter-procedural passes.  A
_small inter-procedural pass_ ('SIMPLE_IPA_PASS') is a pass that does
everything at once and thus cannot be executed during WPA in WHOPR
mode.  It defines only the _Execute_ stage and during this stage it
accesses and modifies the function bodies.  Such passes are useful for
optimization at LGEN or LTRANS time and are used, for example, to
implement early optimization before writing object files.  The simple
inter-procedural passes can also be used for easier prototyping and
development of a new inter-procedural pass.
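
 For comparison, a simple inter-procedural pass only overrides
'execute'.  A hedged skeleton (again with invented 'my_*' names; the
exact 'simple_ipa_opt_pass' interface should be checked in
'tree-pass.h') could look like:

     const pass_data pass_data_my_simple_ipa =
     {
       SIMPLE_IPA_PASS, /* type */
       "my-simple-ipa", /* name */
       OPTGROUP_NONE,   /* optinfo_flags */
       TV_NONE,         /* tv_id */
       0, 0, 0,         /* properties required/provided/destroyed */
       0, 0,            /* todo_flags start/finish */
     };

     class pass_my_simple_ipa : public simple_ipa_opt_pass
     {
     public:
       pass_my_simple_ipa (gcc::context *ctxt)
         : simple_ipa_opt_pass (pass_data_my_simple_ipa, ctxt)
       {}

       unsigned int execute (function *) final override
       {
         /* Everything happens here: function bodies may be read and
            modified directly, which is why such a pass cannot run at
            WPA time.  */
         return 0;
       }
     };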

25.3.1 Virtual clones
---------------------

One of the main challenges of introducing the WHOPR compilation mode was
addressing the interactions between optimization passes.  In LTO
compilation mode, the passes are executed in sequence; each pass
consists of analysis (or _Generate summary_), propagation (or _Execute_)
and _Transform_ stages.  Once the work of one pass is finished, the next
pass sees the updated program representation and can execute.  This
makes the individual passes dependent on each other.

 In WHOPR mode all passes first execute their _Generate summary_ stage.
Then summary writing marks the end of the LGEN stage.  At WPA time, the
summaries are read back into memory and all passes run the _Execute_
stage.  Optimization summaries are streamed and sent to LTRANS, where
all the passes execute the _Transform_ stage.

 Most optimization passes split naturally into analysis, propagation and
transformation stages.  But some do not.  The main problem arises when
one pass performs changes and the following pass gets confused by seeing
different callgraphs between the _Transform_ stage and the _Generate
summary_ or _Execute_ stage.  This means that the passes are required to
communicate their decisions with each other.

 To facilitate this communication, the GCC callgraph infrastructure
implements _virtual clones_, a method of representing the changes
performed by the optimization passes in the callgraph without needing to
update function bodies.

 A _virtual clone_ in the callgraph is a function that has no associated
body, just a description of how to create its body based on a different
function (which itself may be a virtual clone).

 The description of function modifications includes adjustments to the
function's signature (which allows, for example, removing or adding
function arguments), substitutions to perform on the function body, and,
for inlined functions, a pointer to the function that it will be inlined
into.

 It is also possible to redirect any edge of the callgraph from a
function to its virtual clone.  This implies updating the call site to
match the new function signature.

 Most of the transformations performed by inter-procedural optimizations
can be represented via virtual clones.  For instance, a constant
propagation pass can produce a virtual clone of the function which
replaces one of its arguments by a constant.  The inliner can represent
its decisions by producing a clone of a function whose body will be
later integrated into a given function.

 Using _virtual clones_, the program can be easily updated during the
_Execute_ stage, solving most of the pass interaction problems that
would otherwise occur during _Transform_.

 Virtual clones are later materialized in the LTRANS stage and turned
into real functions.  Passes executed after the virtual clone was
introduced also perform their _Transform_ stage on new functions, so for
a pass there is no significant difference between operating on a real
function or a virtual clone introduced before its _Execute_ stage.

 Optimization passes then work on virtual clones introduced before their
_Execute_ stage as if they were real functions.  The only difference is
that clones are not visible during the _Generate summary_ stage.

 To keep function summaries updated, the callgraph interface allows an
optimizer to register a callback that is called every time a new clone
is introduced as well as when the actual function or variable is
generated or when a function or variable is removed.  These hooks are
registered in the _Generate summary_ stage and allow the pass to keep
its information intact until the _Execute_ stage.  The same hooks can
also be registered during the _Execute_ stage to keep the optimization
summaries updated for the _Transform_ stage.
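
 A hedged sketch of such hook registration, loosely modeled on what
existing IPA passes do with the 'symtab' hooks declared in 'cgraph.h'
(the hook bodies, the holder variables and 'my_register_hooks' are
inventions of this example, and the exact hook signatures should be
verified in the headers):

     static cgraph_2node_hook_list *my_dup_hook_holder;
     static cgraph_node_hook_list *my_del_hook_holder;

     /* Called whenever DST is created as a (virtual) clone of SRC.  */
     static void
     my_duplicate_node (cgraph_node *src, cgraph_node *dst, void *data)
     {
       /* Copy SRC's summary to DST so the pass keeps seeing valid data.  */
     }

     /* Called whenever NODE is removed from the callgraph.  */
     static void
     my_remove_node (cgraph_node *node, void *data)
     {
       /* Release the summary attached to NODE.  */
     }

     static void
     my_register_hooks (void)
     {
       my_dup_hook_holder
         = symtab->add_cgraph_duplication_hook (my_duplicate_node, NULL);
       my_del_hook_holder
         = symtab->add_cgraph_removal_hook (my_remove_node, NULL);
     }

 In newer GCC sources the 'function_summary' templates from
'symbol-summary.h' package this bookkeeping for the common case, but the
underlying hook mechanism is the same.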

25.3.2 IPA references
---------------------

GCC represents IPA references in the callgraph.  For a function or
variable 'A', the _IPA reference_ is a list of all locations where the
address of 'A' is taken and, when 'A' is a variable, a list of all
direct stores and reads to/from 'A'.  References represent an oriented
multi-graph on the union of nodes of the callgraph and the varpool.  See
'ipa-reference.c':'ipa_reference_write_optimization_summary' and
'ipa-reference.c':'ipa_reference_read_optimization_summary' for details.
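
 As a hedged illustration, the reference list of a variable can be
walked roughly as follows ('iterate_referring' and the 'IPA_REF_*' kinds
come from 'ipa-ref.h' and 'cgraph.h'; 'my_scan_variable_refs' is a
hypothetical helper and the exact signatures should be verified there):

     /* Classify every reference made to the variable VNODE.  */
     static void
     my_scan_variable_refs (varpool_node *vnode)
     {
       ipa_ref *ref;
       for (unsigned i = 0; vnode->iterate_referring (i, ref); i++)
         switch (ref->use)
           {
           case IPA_REF_LOAD:   /* direct read from the variable */
             break;
           case IPA_REF_STORE:  /* direct store to the variable */
             break;
           case IPA_REF_ADDR:   /* address of the variable taken */
             break;
           default:
             break;
           }
     }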

25.3.3 Jump functions
---------------------

Suppose that an optimization pass sees a function 'A' and it knows the
values of (some of) its arguments.  The _jump function_ describes the
value of a parameter of a given function call in function 'A' based on
this knowledge.
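
 As a concrete illustration (the code below is ordinary C source, not
GCC internals), suppose the pass knows that 'a' in 'foo' is always 7:

     int bar (int x, int y);

     int
     foo (int a)
     {
       /* Jump functions for this call site in foo:
            argument 0: pass-through of foo's parameter 'a', plus 1
            argument 1: the constant 16
          Knowing a == 7, the pass can conclude that bar is called
          with the values (8, 16).  */
       return bar (a + 1, 16);
     }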

 Jump functions are used by several optimizations, such as the
inter-procedural constant propagation pass and the devirtualization
pass.  The inliner also uses jump functions to perform inlining of
callbacks.