gccint: Dependency analysis
1
1 16.8 Data Dependency Analysis
1 =============================
1
1 The code for the data dependence analysis can be found in
1 'tree-data-ref.c' and its interface and data structures are described in
1 'tree-data-ref.h'. The function that computes the data dependences for
1 all the array and pointer references for a given loop is
1 'compute_data_dependences_for_loop'. This function is currently used by
1 the linear loop transform and the vectorization passes. Before calling
1 this function, one has to allocate two vectors: a first vector will
1 contain the set of data references that are contained in the analyzed
1 loop body, and the second vector will contain the dependence relations
1 between the data references. Thus if the vector of data references is
1 of size 'n', the vector containing the dependence relations will contain
1 'n*n' elements. However if the analyzed loop contains side effects,
1 such as calls that potentially can interfere with the data references in
1 the current analyzed loop, the analysis stops while scanning the loop
1 body for data references, and inserts a single 'chrec_dont_know' in the
1 dependence relation array.
1
1 The data references are discovered in a particular order during the
1 scanning of the loop body: the loop body is analyzed in execution order,
1 and the data references of each statement are pushed at the end of the
1 data reference array. Two data references syntactically occur in the
1 program in the same order as in the array of data references. This
1 syntactic order is important in some classical data dependence tests,
1 and mapping this order to the elements of this array avoids costly
1 queries to the loop body representation.
1
1 Three types of data references are currently handled: ARRAY_REF,
1 INDIRECT_REF and COMPONENT_REF. The data structure for the data
1 reference is 'data_reference', where 'data_reference_p' is a name of a
1 pointer to the data reference structure. The structure contains the
1 following elements:
1
1 * 'base_object_info': Provides information about the base object of
1 the data reference and its access functions. These access
1 functions represent the evolution of the data reference in the loop
1 relative to its base, in keeping with the classical meaning of the
1 data reference access function for the support of arrays. For
1 example, for a reference 'a.b[i][j]', the base object is 'a.b' and
1 the access functions, one for each array subscript, are: '{i_init,
1 + i_step}_1, {j_init, +, j_step}_2'.
1
1 * 'first_location_in_loop': Provides information about the first
1 location accessed by the data reference in the loop and about the
1 access function used to represent evolution relative to this
1 location. This data is used to support pointers, and is not used
1 for arrays (for which we have base objects). Pointer accesses are
1 represented as a one-dimensional access that starts from the first
1 location accessed in the loop. For example:
1
1 for1 i
1 for2 j
1 *((int *)p + i + j) = a[i][j];
1
1 The access function of the pointer access is '{0, + 4B}_for2'
1 relative to 'p + i'. The access functions of the array are
1 '{i_init, + i_step}_for1' and '{j_init, +, j_step}_for2' relative
1 to 'a'.
1
1 Usually, the object the pointer refers to is either unknown, or we
1 cannot prove that the access is confined to the boundaries of a
1 certain object.
1
1 Two data references can be compared only if at least one of these
1 two representations has all its fields filled for both data
1 references.
1
1 The current strategy for data dependence tests is as follows: If
1 both 'a' and 'b' are represented as arrays, compare 'a.base_object'
1 and 'b.base_object'; if they are equal, apply dependence tests (use
1 access functions based on base_objects). Else if both 'a' and 'b'
1 are represented as pointers, compare 'a.first_location' and
1 'b.first_location'; if they are equal, apply dependence tests (use
1 access functions based on first location). However, if 'a' and 'b'
1 are represented differently, only try to prove that the bases are
1 definitely different.
1
1 * Aliasing information.
1 * Alignment information.
1
1 The structure describing the relation between two data references is
1 'data_dependence_relation' and the shorter name for a pointer to such a
1 structure is 'ddr_p'. This structure contains:
1
1 * a pointer to each data reference,
1 * a tree node 'are_dependent' that is set to 'chrec_known' if the
1 analysis has proved that there is no dependence between these two
1 data references, 'chrec_dont_know' if the analysis was not able to
1 determine any useful result and potentially there could exist a
1 dependence between these data references, and 'are_dependent' is
1 set to 'NULL_TREE' if there exist a dependence relation between the
1 data references, and the description of this dependence relation is
1 given in the 'subscripts', 'dir_vects', and 'dist_vects' arrays,
1 * a boolean that determines whether the dependence relation can be
1 represented by a classical distance vector,
1 * an array 'subscripts' that contains a description of each subscript
1 of the data references. Given two array accesses a subscript is
1 the tuple composed of the access functions for a given dimension.
1 For example, given 'A[f1][f2][f3]' and 'B[g1][g2][g3]', there are
1 three subscripts: '(f1, g1), (f2, g2), (f3, g3)'.
1 * two arrays 'dir_vects' and 'dist_vects' that contain classical
1 representations of the data dependences under the form of direction
1 and distance dependence vectors,
1 * an array of loops 'loop_nest' that contains the loops to which the
1 distance and direction vectors refer to.
1
1 Several functions for pretty printing the information extracted by the
1 data dependence analysis are available: 'dump_ddrs' prints with a
1 maximum verbosity the details of a data dependence relations array,
1 'dump_dist_dir_vectors' prints only the classical distance and direction
1 vectors for a data dependence relations array, and
1 'dump_data_references' prints the details of the data references
1 contained in a data reference array.
1