gccint: Tree SSA passes
1
1 9.4 Tree SSA passes
1 ===================
1
1 The following briefly describes the Tree optimization passes that are
1 run after gimplification and what source files they are located in.
1
1 * Remove useless statements
1
1 This pass is an extremely simple sweep across the gimple code in
1 which we identify obviously dead code and remove it. Here we do
1 things like simplify 'if' statements with constant conditions,
1 remove exception handling constructs surrounding code that
1 obviously cannot throw, remove lexical bindings that contain no
1 variables, and other assorted simplistic cleanups. The idea is to
1 get rid of the obvious stuff quickly rather than wait until later
1 when it's more work to get rid of it. This pass is located in
1 'tree-cfg.c' and described by 'pass_remove_useless_stmts'.
1
1 * OpenMP lowering
1
1 If OpenMP generation ('-fopenmp') is enabled, this pass lowers
1 OpenMP constructs into GIMPLE.
1
1 Lowering of OpenMP constructs involves creating replacement
1 expressions for local variables that have been mapped using data
1 sharing clauses, exposing the control flow of most synchronization
1 directives and adding region markers to facilitate the creation of
1 the control flow graph. The pass is located in 'omp-low.c' and is
1 described by 'pass_lower_omp'.
1
1 * OpenMP expansion
1
1 If OpenMP generation ('-fopenmp') is enabled, this pass expands
1 parallel regions into their own functions to be invoked by the
1 thread library. The pass is located in 'omp-low.c' and is
1 described by 'pass_expand_omp'.
1
1 * Lower control flow
1
1 This pass flattens 'if' statements ('COND_EXPR') and moves lexical
1 bindings ('BIND_EXPR') out of line. After this pass, all 'if'
1 statements will have exactly two 'goto' statements in its 'then'
1 and 'else' arms. Lexical binding information for each statement
1 will be found in 'TREE_BLOCK' rather than being inferred from its
1 position under a 'BIND_EXPR'. This pass is found in 'gimple-low.c'
1 and is described by 'pass_lower_cf'.
1
1 * Lower exception handling control flow
1
1 This pass decomposes high-level exception handling constructs
1 ('TRY_FINALLY_EXPR' and 'TRY_CATCH_EXPR') into a form that
1 explicitly represents the control flow involved. After this pass,
1 'lookup_stmt_eh_region' will return a non-negative number for any
1 statement that may have EH control flow semantics; examine
1 'tree_can_throw_internal' or 'tree_can_throw_external' for exact
1 semantics. Exact control flow may be extracted from
1 'foreach_reachable_handler'. The EH region nesting tree is defined
1 in 'except.h' and built in 'except.c'. The lowering pass itself is
1 in 'tree-eh.c' and is described by 'pass_lower_eh'.
1
1 * Build the control flow graph
1
1 This pass decomposes a function into basic blocks and creates all
1 of the edges that connect them. It is located in 'tree-cfg.c' and
1 is described by 'pass_build_cfg'.
1
1 * Find all referenced variables
1
1 This pass walks the entire function and collects an array of all
1 variables referenced in the function, 'referenced_vars'. The index
1 at which a variable is found in the array is used as a UID for the
1 variable within this function. This data is needed by the SSA
1 rewriting routines. The pass is located in 'tree-dfa.c' and is
1 described by 'pass_referenced_vars'.
1
1 * Enter static single assignment form
1
1 This pass rewrites the function such that it is in SSA form. After
1 this pass, all 'is_gimple_reg' variables will be referenced by
1 'SSA_NAME', and all occurrences of other variables will be
1 annotated with 'VDEFS' and 'VUSES'; PHI nodes will have been
1 inserted as necessary for each basic block. This pass is located
1 in 'tree-ssa.c' and is described by 'pass_build_ssa'.
1
1 * Warn for uninitialized variables
1
1 This pass scans the function for uses of 'SSA_NAME's that are fed
1 by default definition. For non-parameter variables, such uses are
1 uninitialized. The pass is run twice, before and after
1 optimization (if turned on). In the first pass we only warn for
1 uses that are positively uninitialized; in the second pass we warn
1 for uses that are possibly uninitialized. The pass is located in
1 'tree-ssa.c' and is defined by 'pass_early_warn_uninitialized' and
1 'pass_late_warn_uninitialized'.
1
1 * Dead code elimination
1
1 This pass scans the function for statements without side effects
1 whose result is unused. It does not do memory life analysis, so
1 any value that is stored in memory is considered used. The pass is
1 run multiple times throughout the optimization process. It is
1 located in 'tree-ssa-dce.c' and is described by 'pass_dce'.
1
1 * Dominator optimizations
1
1 This pass performs trivial dominator-based copy and constant
1 propagation, expression simplification, and jump threading. It is
1 run multiple times throughout the optimization process. It is
1 located in 'tree-ssa-dom.c' and is described by 'pass_dominator'.
1
1 * Forward propagation of single-use variables
1
1 This pass attempts to remove redundant computation by substituting
1 variables that are used once into the expression that uses them and
1 seeing if the result can be simplified. It is located in
1 'tree-ssa-forwprop.c' and is described by 'pass_forwprop'.
1
1 * Copy Renaming
1
1 This pass attempts to change the name of compiler temporaries
1 involved in copy operations such that SSA->normal can coalesce the
1 copy away. When compiler temporaries are copies of user variables,
1 it also renames the compiler temporary to the user variable
1 resulting in better use of user symbols. It is located in
1 'tree-ssa-copyrename.c' and is described by 'pass_copyrename'.
1
1 * PHI node optimizations
1
1 This pass recognizes forms of PHI inputs that can be represented as
1 conditional expressions and rewrites them into straight line code.
1 It is located in 'tree-ssa-phiopt.c' and is described by
1 'pass_phiopt'.
1
1 * May-alias optimization
1
1 This pass performs a flow sensitive SSA-based points-to analysis.
1 The resulting may-alias, must-alias, and escape analysis
1 information is used to promote variables from in-memory addressable
1 objects to non-aliased variables that can be renamed into SSA form.
1 We also update the 'VDEF'/'VUSE' memory tags for non-renameable
1 aggregates so that we get fewer false kills. The pass is located
1 in 'tree-ssa-alias.c' and is described by 'pass_may_alias'.
1
1 Interprocedural points-to information is located in
1 'tree-ssa-structalias.c' and described by 'pass_ipa_pta'.
1
1 * Profiling
1
1 This pass instruments the function in order to collect runtime
1 block and value profiling data. Such data may be fed back into the
1 compiler on a subsequent run so as to allow optimization based on
1 expected execution frequencies. The pass is located in
1 'tree-profile.c' and is described by 'pass_ipa_tree_profile'.
1
1 * Static profile estimation
1
1 This pass implements series of heuristics to guess propababilities
1 of branches. The resulting predictions are turned into edge
1 profile by propagating branches across the control flow graphs.
1 The pass is located in 'tree-profile.c' and is described by
1 'pass_profile'.
1
1 * Lower complex arithmetic
1
1 This pass rewrites complex arithmetic operations into their
1 component scalar arithmetic operations. The pass is located in
1 'tree-complex.c' and is described by 'pass_lower_complex'.
1
1 * Scalar replacement of aggregates
1
1 This pass rewrites suitable non-aliased local aggregate variables
1 into a set of scalar variables. The resulting scalar variables are
1 rewritten into SSA form, which allows subsequent optimization
1 passes to do a significantly better job with them. The pass is
1 located in 'tree-sra.c' and is described by 'pass_sra'.
1
1 * Dead store elimination
1
1 This pass eliminates stores to memory that are subsequently
1 overwritten by another store, without any intervening loads. The
1 pass is located in 'tree-ssa-dse.c' and is described by 'pass_dse'.
1
1 * Tail recursion elimination
1
1 This pass transforms tail recursion into a loop. It is located in
1 'tree-tailcall.c' and is described by 'pass_tail_recursion'.
1
1 * Forward store motion
1
1 This pass sinks stores and assignments down the flowgraph closer to
1 their use point. The pass is located in 'tree-ssa-sink.c' and is
1 described by 'pass_sink_code'.
1
1 * Partial redundancy elimination
1
1 This pass eliminates partially redundant computations, as well as
1 performing load motion. The pass is located in 'tree-ssa-pre.c'
1 and is described by 'pass_pre'.
1
1 Just before partial redundancy elimination, if
1 '-funsafe-math-optimizations' is on, GCC tries to convert divisions
1 to multiplications by the reciprocal. The pass is located in
1 'tree-ssa-math-opts.c' and is described by 'pass_cse_reciprocal'.
1
1 * Full redundancy elimination
1
1 This is a simpler form of PRE that only eliminates redundancies
1 that occur on all paths. It is located in 'tree-ssa-pre.c' and
1 described by 'pass_fre'.
1
1 * Loop optimization
1
1 The main driver of the pass is placed in 'tree-ssa-loop.c' and
1 described by 'pass_loop'.
1
1 The optimizations performed by this pass are:
1
1 Loop invariant motion. This pass moves only invariants that would
1 be hard to handle on RTL level (function calls, operations that
1 expand to nontrivial sequences of insns). With '-funswitch-loops'
1 it also moves operands of conditions that are invariant out of the
1 loop, so that we can use just trivial invariantness analysis in
1 loop unswitching. The pass also includes store motion. The pass
1 is implemented in 'tree-ssa-loop-im.c'.
1
1 Canonical induction variable creation. This pass creates a simple
1 counter for number of iterations of the loop and replaces the exit
1 condition of the loop using it, in case when a complicated analysis
1 is necessary to determine the number of iterations. Later
1 optimizations then may determine the number easily. The pass is
1 implemented in 'tree-ssa-loop-ivcanon.c'.
1
1 Induction variable optimizations. This pass performs standard
1 induction variable optimizations, including strength reduction,
1 induction variable merging and induction variable elimination. The
1 pass is implemented in 'tree-ssa-loop-ivopts.c'.
1
1 Loop unswitching. This pass moves the conditional jumps that are
1 invariant out of the loops. To achieve this, a duplicate of the
1 loop is created for each possible outcome of conditional jump(s).
1 The pass is implemented in 'tree-ssa-loop-unswitch.c'.
1
1 Loop splitting. If a loop contains a conditional statement that is
1 always true for one part of the iteration space and false for the
1 other this pass splits the loop into two, one dealing with one side
1 the other only with the other, thereby removing one inner-loop
1 conditional. The pass is implemented in 'tree-ssa-loop-split.c'.
1
1 The optimizations also use various utility functions contained in
1 'tree-ssa-loop-manip.c', 'cfgloop.c', 'cfgloopanal.c' and
1 'cfgloopmanip.c'.
1
1 Vectorization. This pass transforms loops to operate on vector
1 types instead of scalar types. Data parallelism across loop
1 iterations is exploited to group data elements from consecutive
1 iterations into a vector and operate on them in parallel.
1 Depending on available target support the loop is conceptually
1 unrolled by a factor 'VF' (vectorization factor), which is the
1 number of elements operated upon in parallel in each iteration, and
1 the 'VF' copies of each scalar operation are fused to form a vector
1 operation. Additional loop transformations such as peeling and
1 versioning may take place to align the number of iterations, and to
1 align the memory accesses in the loop. The pass is implemented in
1 'tree-vectorizer.c' (the main driver), 'tree-vect-loop.c' and
1 'tree-vect-loop-manip.c' (loop specific parts and general loop
1 utilities), 'tree-vect-slp' (loop-aware SLP functionality),
1 'tree-vect-stmts.c' and 'tree-vect-data-refs.c'. Analysis of data
1 references is in 'tree-data-ref.c'.
1
1 SLP Vectorization. This pass performs vectorization of
1 straight-line code. The pass is implemented in 'tree-vectorizer.c'
1 (the main driver), 'tree-vect-slp.c', 'tree-vect-stmts.c' and
1 'tree-vect-data-refs.c'.
1
1 Autoparallelization. This pass splits the loop iteration space to
1 run into several threads. The pass is implemented in
1 'tree-parloops.c'.
1
1 Graphite is a loop transformation framework based on the polyhedral
1 model. Graphite stands for Gimple Represented as Polyhedra. The
1 internals of this infrastructure are documented in
1 <http://gcc.gnu.org/wiki/Graphite>. The passes working on this
1 representation are implemented in the various 'graphite-*' files.
1
1 * Tree level if-conversion for vectorizer
1
1 This pass applies if-conversion to simple loops to help vectorizer.
1 We identify if convertible loops, if-convert statements and merge
1 basic blocks in one big block. The idea is to present loop in such
1 form so that vectorizer can have one to one mapping between
1 statements and available vector operations. This pass is located
1 in 'tree-if-conv.c' and is described by 'pass_if_conversion'.
1
1 * Conditional constant propagation
1
1 This pass relaxes a lattice of values in order to identify those
1 that must be constant even in the presence of conditional branches.
1 The pass is located in 'tree-ssa-ccp.c' and is described by
1 'pass_ccp'.
1
1 A related pass that works on memory loads and stores, and not just
1 register values, is located in 'tree-ssa-ccp.c' and described by
1 'pass_store_ccp'.
1
1 * Conditional copy propagation
1
1 This is similar to constant propagation but the lattice of values
1 is the "copy-of" relation. It eliminates redundant copies from the
1 code. The pass is located in 'tree-ssa-copy.c' and described by
1 'pass_copy_prop'.
1
1 A related pass that works on memory copies, and not just register
1 copies, is located in 'tree-ssa-copy.c' and described by
1 'pass_store_copy_prop'.
1
1 * Value range propagation
1
1 This transformation is similar to constant propagation but instead
1 of propagating single constant values, it propagates known value
1 ranges. The implementation is based on Patterson's range
1 propagation algorithm (Accurate Static Branch Prediction by Value
1 Range Propagation, J. R. C. Patterson, PLDI '95). In contrast to
1 Patterson's algorithm, this implementation does not propagate
1 branch probabilities nor it uses more than a single range per SSA
1 name. This means that the current implementation cannot be used
1 for branch prediction (though adapting it would not be difficult).
1 The pass is located in 'tree-vrp.c' and is described by 'pass_vrp'.
1
1 * Folding built-in functions
1
1 This pass simplifies built-in functions, as applicable, with
1 constant arguments or with inferable string lengths. It is located
1 in 'tree-ssa-ccp.c' and is described by 'pass_fold_builtins'.
1
1 * Split critical edges
1
1 This pass identifies critical edges and inserts empty basic blocks
1 such that the edge is no longer critical. The pass is located in
1 'tree-cfg.c' and is described by 'pass_split_crit_edges'.
1
1 * Control dependence dead code elimination
1
1 This pass is a stronger form of dead code elimination that can
1 eliminate unnecessary control flow statements. It is located in
1 'tree-ssa-dce.c' and is described by 'pass_cd_dce'.
1
1 * Tail call elimination
1
1 This pass identifies function calls that may be rewritten into
1 jumps. No code transformation is actually applied here, but the
1 data and control flow problem is solved. The code transformation
1 requires target support, and so is delayed until RTL. In the
1 meantime 'CALL_EXPR_TAILCALL' is set indicating the possibility.
1 The pass is located in 'tree-tailcall.c' and is described by
1 'pass_tail_calls'. The RTL transformation is handled by
1 'fixup_tail_calls' in 'calls.c'.
1
1 * Warn for function return without value
1
1 For non-void functions, this pass locates return statements that do
1 not specify a value and issues a warning. Such a statement may
1 have been injected by falling off the end of the function. This
1 pass is run last so that we have as much time as possible to prove
1 that the statement is not reachable. It is located in 'tree-cfg.c'
1 and is described by 'pass_warn_function_return'.
1
1 * Leave static single assignment form
1
1 This pass rewrites the function such that it is in normal form. At
1 the same time, we eliminate as many single-use temporaries as
1 possible, so the intermediate language is no longer GIMPLE, but
1 GENERIC. The pass is located in 'tree-outof-ssa.c' and is
1 described by 'pass_del_ssa'.
1
1 * Merge PHI nodes that feed into one another
1
1 This is part of the CFG cleanup passes. It attempts to join PHI
1 nodes from a forwarder CFG block into another block with PHI nodes.
1 The pass is located in 'tree-cfgcleanup.c' and is described by
1 'pass_merge_phi'.
1
1 * Return value optimization
1
1 If a function always returns the same local variable, and that
1 local variable is an aggregate type, then the variable is replaced
1 with the return value for the function (i.e., the function's
1 DECL_RESULT). This is equivalent to the C++ named return value
1 optimization applied to GIMPLE. The pass is located in
1 'tree-nrv.c' and is described by 'pass_nrv'.
1
1 * Return slot optimization
1
1 If a function returns a memory object and is called as 'var =
1 foo()', this pass tries to change the call so that the address of
1 'var' is sent to the caller to avoid an extra memory copy. This
1 pass is located in 'tree-nrv.c' and is described by
1 'pass_return_slot'.
1
1 * Optimize calls to '__builtin_object_size'
1
1 This is a propagation pass similar to CCP that tries to remove
1 calls to '__builtin_object_size' when the size of the object can be
1 computed at compile-time. This pass is located in
1 'tree-object-size.c' and is described by 'pass_object_sizes'.
1
1 * Loop invariant motion
1
1 This pass removes expensive loop-invariant computations out of
1 loops. The pass is located in 'tree-ssa-loop.c' and described by
1 'pass_lim'.
1
1 * Loop nest optimizations
1
1 This is a family of loop transformations that works on loop nests.
1 It includes loop interchange, scaling, skewing and reversal and
1 they are all geared to the optimization of data locality in array
1 traversals and the removal of dependencies that hamper
1 optimizations such as loop parallelization and vectorization. The
1 pass is located in 'tree-loop-linear.c' and described by
1 'pass_linear_transform'.
1
1 * Removal of empty loops
1
1 This pass removes loops with no code in them. The pass is located
1 in 'tree-ssa-loop-ivcanon.c' and described by 'pass_empty_loop'.
1
1 * Unrolling of small loops
1
1 This pass completely unrolls loops with few iterations. The pass
1 is located in 'tree-ssa-loop-ivcanon.c' and described by
1 'pass_complete_unroll'.
1
1 * Predictive commoning
1
1 This pass makes the code reuse the computations from the previous
1 iterations of the loops, especially loads and stores to memory. It
1 does so by storing the values of these computations to a bank of
1 temporary variables that are rotated at the end of loop. To avoid
1 the need for this rotation, the loop is then unrolled and the
1 copies of the loop body are rewritten to use the appropriate
1 version of the temporary variable. This pass is located in
1 'tree-predcom.c' and described by 'pass_predcom'.
1
1 * Array prefetching
1
1 This pass issues prefetch instructions for array references inside
1 loops. The pass is located in 'tree-ssa-loop-prefetch.c' and
1 described by 'pass_loop_prefetch'.
1
1 * Reassociation
1
1 This pass rewrites arithmetic expressions to enable optimizations
1 that operate on them, like redundancy elimination and
1 vectorization. The pass is located in 'tree-ssa-reassoc.c' and
1 described by 'pass_reassoc'.
1
1 * Optimization of 'stdarg' functions
1
1 This pass tries to avoid the saving of register arguments into the
1 stack on entry to 'stdarg' functions. If the function doesn't use
1 any 'va_start' macros, no registers need to be saved. If
1 'va_start' macros are used, the 'va_list' variables don't escape
1 the function, it is only necessary to save registers that will be
1 used in 'va_arg' macros. For instance, if 'va_arg' is only used
1 with integral types in the function, floating point registers don't
1 need to be saved. This pass is located in 'tree-stdarg.c' and
1 described by 'pass_stdarg'.
1