gccint: SSA

1 
1 13.3 Static Single Assignment
1 =============================
1 
1 Most of the tree optimizers rely on the data flow information provided
1 by the Static Single Assignment (SSA) form.  We implement the SSA form
1 as described in 'R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and K.
1 Zadeck. Efficiently Computing Static Single Assignment Form and the
1 Control Dependence Graph. ACM Transactions on Programming Languages and
1 Systems, 13(4):451-490, October 1991'.
1 
1  The SSA form is based on the premise that program variables are
1 assigned in exactly one location in the program.  Multiple assignments
1 to the same variable create new versions of that variable.  Naturally,
1 actual programs are seldom in SSA form initially because variables tend
1 to be assigned multiple times.  The compiler modifies the program
1 representation so that every time a variable is assigned in the code, a
1 new version of the variable is created.  Different versions of the same
1 variable are distinguished by subscripting the variable name with its
1 version number.  Variables used in the right-hand side of expressions
1 are renamed so that their version number matches that of the most recent
1 assignment.
1 
1  We represent variable versions using 'SSA_NAME' nodes.  The renaming
1 process in 'tree-ssa.c' wraps every real and virtual operand with an
1 'SSA_NAME' node which contains the version number and the statement that
1 created the 'SSA_NAME'.  Only definitions and virtual definitions may
1 create new 'SSA_NAME' nodes.
1 
1  Sometimes, flow of control makes it impossible to determine the most
1 recent version of a variable.  In these cases, the compiler inserts an
1 artificial definition for that variable called "PHI function" or "PHI
1 node".  This new definition merges all the incoming versions of the
1 variable to create a new name for it.  For instance,
1 
1      if (...)
1        a_1 = 5;
1      else if (...)
1        a_2 = 2;
1      else
1        a_3 = 13;
1 
1      # a_4 = PHI <a_1, a_2, a_3>
1      return a_4;
1 
1  Since it is not possible to determine which of the three branches will
1 be taken at runtime, we don't know which of 'a_1', 'a_2' or 'a_3' to use
1 at the return statement.  So, the SSA renamer creates a new version
1 'a_4' which is assigned the result of "merging" 'a_1', 'a_2' and 'a_3'.
1 Hence, PHI nodes mean "one of these operands.  I don't know which".
1 
1  The following functions can be used to examine PHI nodes
1 
1  -- Function: gimple_phi_result (PHI)
1      Returns the 'SSA_NAME' created by PHI node PHI (i.e., PHI's LHS).
1 
1  -- Function: gimple_phi_num_args (PHI)
1      Returns the number of arguments in PHI.  This number is exactly the
1      number of incoming edges to the basic block holding PHI.
1 
1  -- Function: gimple_phi_arg (PHI, I)
1      Returns Ith argument of PHI.
1 
1  -- Function: gimple_phi_arg_edge (PHI, I)
1      Returns the incoming edge for the Ith argument of PHI.
1 
1  -- Function: gimple_phi_arg_def (PHI, I)
1      Returns the 'SSA_NAME' for the Ith argument of PHI.
1 
1 13.3.1 Preserving the SSA form
1 ------------------------------
1 
1 Some optimization passes make changes to the function that invalidate
1 the SSA property.  This can happen when a pass has added new symbols or
1 changed the program so that variables that were previously aliased
1 aren't anymore.  Whenever something like this happens, the affected
1 symbols must be renamed into SSA form again.  Transformations that emit
1 new code or replicate existing statements will also need to update the
1 SSA form.
1 
1  Since GCC implements two different SSA forms for register and virtual
1 variables, keeping the SSA form up to date depends on whether you are
1 updating register or virtual names.  In both cases, the general idea
1 behind incremental SSA updates is similar: when new SSA names are
1 created, they typically are meant to replace other existing names in the
1 program.
1 
1  For instance, given the following code:
1 
1           1  L0:
1           2  x_1 = PHI (0, x_5)
1           3  if (x_1 < 10)
1           4    if (x_1 > 7)
1           5      y_2 = 0
1           6    else
1           7      y_3 = x_1 + x_7
1           8    endif
1           9    x_5 = x_1 + 1
1           10   goto L0;
1           11 endif
1 
1  Suppose that we insert new names 'x_10' and 'x_11' (lines '4' and '8').
1 
1           1  L0:
1           2  x_1 = PHI (0, x_5)
1           3  if (x_1 < 10)
1           4    x_10 = ...
1           5    if (x_1 > 7)
1           6      y_2 = 0
1           7    else
1           8      x_11 = ...
1           9      y_3 = x_1 + x_7
1           10   endif
1           11   x_5 = x_1 + 1
1           12   goto L0;
1           13 endif
1 
1  We want to replace all the uses of 'x_1' with the new definitions of
1 'x_10' and 'x_11'.  Note that the only uses that should be replaced are
1 those at lines '5', '9' and '11'.  Also, the use of 'x_7' at line '9'
1 should _not_ be replaced (this is why we cannot just mark symbol 'x' for
1 renaming).
1 
1  Additionally, we may need to insert a PHI node at line '11' because
1 that is a merge point for 'x_10' and 'x_11'.  So the use of 'x_1' at
1 line '11' will be replaced with the new PHI node.  The insertion of PHI
1 nodes is optional.  They are not strictly necessary to preserve the SSA
1 form, and depending on what the caller inserted, they may not even be
1 useful for the optimizers.
1 
1  Updating the SSA form is a two step process.  First, the pass has to
1 identify which names need to be updated and/or which symbols need to be
1 renamed into SSA form for the first time.  When new names are introduced
1 to replace existing names in the program, the mapping between the old
1 and the new names are registered by calling 'register_new_name_mapping'
1 (note that if your pass creates new code by duplicating basic blocks,
1 the call to 'tree_duplicate_bb' will set up the necessary mappings
1 automatically).
1 
1  After the replacement mappings have been registered and new symbols
1 marked for renaming, a call to 'update_ssa' makes the registered
1 changes.  This can be done with an explicit call or by creating 'TODO'
1 flags in the 'tree_opt_pass' structure for your pass.  There are several
1 'TODO' flags that control the behavior of 'update_ssa':
1 
1    * 'TODO_update_ssa'.  Update the SSA form inserting PHI nodes for
1      newly exposed symbols and virtual names marked for updating.  When
1      updating real names, only insert PHI nodes for a real name 'O_j' in
1      blocks reached by all the new and old definitions for 'O_j'.  If
1      the iterated dominance frontier for 'O_j' is not pruned, we may end
1      up inserting PHI nodes in blocks that have one or more edges with
1      no incoming definition for 'O_j'.  This would lead to uninitialized
1      warnings for 'O_j''s symbol.
1 
1    * 'TODO_update_ssa_no_phi'.  Update the SSA form without inserting
1      any new PHI nodes at all.  This is used by passes that have either
1      inserted all the PHI nodes themselves or passes that need only to
1      patch use-def and def-def chains for virtuals (e.g., DCE).
1 
1    * 'TODO_update_ssa_full_phi'.  Insert PHI nodes everywhere they are
1      needed.  No pruning of the IDF is done.  This is used by passes
1      that need the PHI nodes for 'O_j' even if it means that some
1      arguments will come from the default definition of 'O_j''s symbol
1      (e.g., 'pass_linear_transform').
1 
1      WARNING: If you need to use this flag, chances are that your pass
1      may be doing something wrong.  Inserting PHI nodes for an old name
1      where not all edges carry a new replacement may lead to silent
1      codegen errors or spurious uninitialized warnings.
1 
1    * 'TODO_update_ssa_only_virtuals'.  Passes that update the SSA form
1      on their own may want to delegate the updating of virtual names to
1      the generic updater.  Since FUD chains are easier to maintain, this
1      simplifies the work they need to do.  NOTE: If this flag is used,
1      any OLD->NEW mappings for real names are explicitly destroyed and
1      only the symbols marked for renaming are processed.
1 
1 13.3.2 Examining 'SSA_NAME' nodes
1 ---------------------------------
1 
1 The following macros can be used to examine 'SSA_NAME' nodes
1 
1  -- Macro: SSA_NAME_DEF_STMT (VAR)
1      Returns the statement S that creates the 'SSA_NAME' VAR.  If S is
1      an empty statement (i.e., 'IS_EMPTY_STMT (S)' returns 'true'), it
1      means that the first reference to this variable is a USE or a VUSE.
1 
1  -- Macro: SSA_NAME_VERSION (VAR)
1      Returns the version number of the 'SSA_NAME' object VAR.
1 
1 13.3.3 Walking the dominator tree
1 ---------------------------------
1 
1  -- Tree SSA function: void walk_dominator_tree (WALK_DATA, BB)
1 
1      This function walks the dominator tree for the current CFG calling
1      a set of callback functions defined in STRUCT DOM_WALK_DATA in
1      'domwalk.h'.  The call back functions you need to define give you
1      hooks to execute custom code at various points during traversal:
1 
1        1. Once to initialize any local data needed while processing BB
1           and its children.  This local data is pushed into an internal
1           stack which is automatically pushed and popped as the walker
1           traverses the dominator tree.
1 
1        2. Once before traversing all the statements in the BB.
1 
1        3. Once for every statement inside BB.
1 
1        4. Once after traversing all the statements and before recursing
1           into BB's dominator children.
1 
1        5. It then recurses into all the dominator children of BB.
1 
1        6. After recursing into all the dominator children of BB it can,
1           optionally, traverse every statement in BB again (i.e.,
1           repeating steps 2 and 3).
1 
1        7. Once after walking the statements in BB and BB's dominator
1           children.  At this stage, the block local data stack is
1           popped.
1