gccint: IPA

25.3 Using summary information in IPA passes
============================================

Programs are represented internally as a _callgraph_ (a multi-graph
where nodes are functions and edges are call sites) and a _varpool_ (a
list of static and external variables in the program).

The inter-procedural optimization is organized as a sequence of
individual passes, which operate on the callgraph and the varpool. To
make the implementation of WHOPR possible, every inter-procedural
optimization pass is split into several stages that are executed at
different times during WHOPR compilation:

   * LGEN time
     1. _Generate summary_ ('generate_summary' in 'struct
        ipa_opt_pass_d'). This stage analyzes every function body and
        variable initializer and stores the relevant information into a
        pass-specific data structure.

     2. _Write summary_ ('write_summary' in 'struct ipa_opt_pass_d').
        This stage writes all the pass-specific information generated
        by 'generate_summary'. Summaries go into their own
        'LTO_section_*' sections that have to be declared in
        'lto-streamer.h':'enum lto_section_type'. A new section is
        created by calling 'create_output_block' and data can be
        written using the 'lto_output_*' routines.

   * WPA time
     1. _Read summary_ ('read_summary' in 'struct ipa_opt_pass_d').
        This stage reads all the pass-specific information in exactly
        the same order that it was written by 'write_summary'.

     2. _Execute_ ('execute' in 'struct opt_pass'). This performs
        inter-procedural propagation. This must be done without
        actual access to the individual function bodies or variable
        initializers. Typically, this results in a transitive closure
        operation over the summary information of all the nodes in the
        callgraph.

     3. _Write optimization summary_ ('write_optimization_summary' in
        'struct ipa_opt_pass_d'). This writes the result of the
        inter-procedural propagation into the object file. This can
        use the same data structures and helper routines used in
        'write_summary'.

   * LTRANS time
     1. _Read optimization summary_ ('read_optimization_summary' in
        'struct ipa_opt_pass_d'). The counterpart to
        'write_optimization_summary'. This reads the interprocedural
        optimization decisions in exactly the same format emitted by
        'write_optimization_summary'.

     2. _Transform_ ('function_transform' and 'variable_transform' in
        'struct ipa_opt_pass_d'). The actual function bodies and
        variable initializers are updated based on the information
        passed down from the _Execute_ stage.

The implementation of the inter-procedural passes is shared between
LTO, WHOPR and classic non-LTO compilation.

   * During the traditional file-by-file mode every pass executes its
     own _Generate summary_, _Execute_, and _Transform_ stages within
     the single execution context of the compiler.

   * In LTO compilation mode, every pass uses _Generate summary_ and
     _Write summary_ stages at compilation time, while the _Read
     summary_, _Execute_, and _Transform_ stages are executed at link
     time.

   * In WHOPR mode all stages are used.

To simplify development, the GCC pass manager differentiates between
normal inter-procedural passes and small inter-procedural passes. A
_small inter-procedural pass_ ('SIMPLE_IPA_PASS') is a pass that does
everything at once and thus cannot be executed during WPA in WHOPR
mode. It defines only the _Execute_ stage and during this stage it
accesses and modifies the function bodies. Such passes are useful for
optimization at LGEN or LTRANS time and are used, for example, to
implement early optimization before writing object files. The simple
inter-procedural passes can also be used for easier prototyping and
development of a new inter-procedural pass.

25.3.1 Virtual clones
---------------------

One of the main challenges of introducing the WHOPR compilation mode was
addressing the interactions between optimization passes. In LTO
compilation mode, the passes are executed in a sequence, each of which
consists of analysis (or _Generate summary_), propagation (or _Execute_)
and _Transform_ stages. Once the work of one pass is finished, the next
pass sees the updated program representation and can execute. This
makes the individual passes dependent on each other.

In WHOPR mode all passes first execute their _Generate summary_ stage.
Then summary writing marks the end of the LGEN stage. At WPA time, the
summaries are read back into memory and all passes run the _Execute_
stage. Optimization summaries are streamed and sent to LTRANS, where
all the passes execute the _Transform_ stage.

Most optimization passes split naturally into analysis, propagation and
transformation stages. But some do not. The main problem arises when
one pass performs changes and the following pass gets confused by seeing
different callgraphs between the _Transform_ stage and the _Generate
summary_ or _Execute_ stage. This means that the passes are required to
communicate their decisions with each other.

To facilitate this communication, the GCC callgraph infrastructure
implements _virtual clones_, a method of representing the changes
performed by the optimization passes in the callgraph without needing to
update function bodies.

A _virtual clone_ in the callgraph is a function that has no associated
body, just a description of how to create its body based on a different
function (which itself may be a virtual clone).

The description of function modifications includes adjustments to the
function's signature (which allows, for example, removing or adding
function arguments), substitutions to perform on the function body, and,
for inlined functions, a pointer to the function that it will be inlined
into.

It is also possible to redirect any edge of the callgraph from a
function to its virtual clone. This implies updating the call site
to adjust for the new function signature.

Most of the transformations performed by inter-procedural optimizations
can be represented via virtual clones. For instance, a constant
propagation pass can produce a virtual clone of the function which
replaces one of its arguments by a constant. The inliner can represent
its decisions by producing a clone of a function whose body will be
later integrated into a given function.

Using _virtual clones_, the program can be easily updated during the
_Execute_ stage, solving most of the pass interaction problems that
would otherwise occur during _Transform_.

Virtual clones are later materialized in the LTRANS stage and turned
into real functions. Passes executed after the virtual clone was
introduced also perform their _Transform_ stage on the new functions, so
for a pass there is no significant difference between operating on a
real function or a virtual clone introduced before its _Execute_ stage.

Optimization passes then work on virtual clones introduced before their
_Execute_ stage as if they were real functions. The only difference is
that clones are not visible during the _Generate summary_ stage.

To keep function summaries updated, the callgraph interface allows an
optimizer to register a callback that is called every time a new clone
is introduced, when the actual function or variable is generated, and
when a function or variable is removed. These hooks are registered in
the _Generate summary_ stage and allow the pass to keep its information
intact until the _Execute_ stage. The same hooks can also be registered
during the _Execute_ stage to keep the optimization summaries updated
for the _Transform_ stage.

25.3.2 IPA references
---------------------

GCC represents IPA references in the callgraph. For a function or
variable 'A', the _IPA reference_ is a list of all locations where the
address of 'A' is taken and, when 'A' is a variable, a list of all
direct stores and reads to/from 'A'. References represent an oriented
multi-graph on the union of nodes of the callgraph and the varpool. See
'ipa-reference.c':'ipa_reference_write_optimization_summary' and
'ipa-reference.c':'ipa_reference_read_optimization_summary' for details.

25.3.3 Jump functions
---------------------

Suppose that an optimization pass sees a function 'A' and it knows the
values of (some of) its arguments. The _jump function_ describes the
value of a parameter of a given function call in function 'A' based on
this knowledge.

Jump functions are used by several optimizations, such as the
inter-procedural constant propagation pass and the devirtualization
pass. The inliner also uses jump functions to perform inlining of
callbacks.