gccint: Basic Blocks
1
1 15.1 Basic Blocks
1 =================
1
1 A basic block is a straight-line sequence of code with only one entry
1 point and only one exit. In GCC, basic blocks are represented using the
1 'basic_block' data type.
1
1 Special basic blocks represent possible entry and exit points of a
1 function. These blocks are called 'ENTRY_BLOCK_PTR' and
1 'EXIT_BLOCK_PTR'. These blocks do not contain any code.
1
1 The 'BASIC_BLOCK' array contains all basic blocks in an unspecified
1 order. Each 'basic_block' structure has a field that holds a unique
1 integer identifier 'index' that is the index of the block in the
1 'BASIC_BLOCK' array. The total number of basic blocks in the function
1 is 'n_basic_blocks'. Both the basic block indices and the total number
1 of basic blocks may vary during the compilation process, as passes
1 reorder, create, duplicate, and destroy basic blocks. The index for any
1 block should never be greater than 'last_basic_block'. The indices 0
1 and 1 are special codes reserved for 'ENTRY_BLOCK' and 'EXIT_BLOCK', the
1 indices of 'ENTRY_BLOCK_PTR' and 'EXIT_BLOCK_PTR'.
1
1 Two pointer members of the 'basic_block' structure are the pointers
1 'next_bb' and 'prev_bb'. These are used to keep doubly linked chain of
1 basic blocks in the same order as the underlying instruction stream.
1 The chain of basic blocks is updated transparently by the provided API
1 for manipulating the CFG. The macro 'FOR_EACH_BB' can be used to visit
1 all the basic blocks in lexicographical order, except 'ENTRY_BLOCK' and
1 'EXIT_BLOCK'. The macro 'FOR_ALL_BB' also visits all basic blocks in
1 lexicographical order, including 'ENTRY_BLOCK' and 'EXIT_BLOCK'.
1
1 The functions 'post_order_compute' and 'inverted_post_order_compute'
1 can be used to compute topological orders of the CFG. The orders are
1 stored as vectors of basic block indices. The 'BASIC_BLOCK' array can
1 be used to iterate each basic block by index. Dominator traversals are
1 also possible using 'walk_dominator_tree'. Given two basic blocks A and
1 B, block A dominates block B if A is _always_ executed before B.
1
1 Each 'basic_block' also contains pointers to the first instruction (the
1 "head") and the last instruction (the "tail") or "end" of the
1 instruction stream contained in a basic block. In fact, since the
1 'basic_block' data type is used to represent blocks in both major
1 intermediate representations of GCC ('GIMPLE' and RTL), there are
1 pointers to the head and end of a basic block for both representations,
1 stored in intermediate representation specific data in the 'il' field of
1 'struct basic_block_def'.
1
1 For RTL, these pointers are 'BB_HEAD' and 'BB_END'.
1
1 In the RTL representation of a function, the instruction stream
1 contains not only the "real" instructions, but also "notes" or "insn
1 notes" (to distinguish them from "reg notes"). Any function that moves
1 or duplicates the basic blocks needs to take care of updating of these
1 notes. Many of these notes expect that the instruction stream consists
1 of linear regions, so updating can sometimes be tedious. All types of
1 insn notes are defined in 'insn-notes.def'.
1
1 In the RTL function representation, the instructions contained in a
1 basic block always follow a 'NOTE_INSN_BASIC_BLOCK', but zero or more
1 'CODE_LABEL' nodes can precede the block note. A basic block ends with
1 a control flow instruction or with the last instruction before the next
1 'CODE_LABEL' or 'NOTE_INSN_BASIC_BLOCK'. By definition, a 'CODE_LABEL'
1 cannot appear in the middle of the instruction stream of a basic block.
1
1 In addition to notes, the jump table vectors are also represented as
1 "pseudo-instructions" inside the insn stream. These vectors never
1 appear in the basic block and should always be placed just after the
1 table jump instructions referencing them. After removing the table-jump
1 it is often difficult to eliminate the code computing the address and
1 referencing the vector, so cleaning up these vectors is postponed until
1 after liveness analysis. Thus the jump table vectors may appear in the
1 insn stream unreferenced and without any purpose. Before any edge is
1 made "fall-thru", the existence of such construct in the way needs to be
1 checked by calling 'can_fallthru' function.
1
1 For the 'GIMPLE' representation, the PHI nodes and statements contained
1 in a basic block are in a 'gimple_seq' pointed to by the basic block
1 intermediate language specific pointers. Abstract containers and
1 iterators are used to access the PHI nodes and statements in a basic
1 blocks. These iterators are called "GIMPLE statement iterators" (GSIs).
1 Grep for '^gsi' in the various 'gimple-*' and 'tree-*' files. There is
1 a 'gimple_stmt_iterator' type for iterating over all kinds of statement,
1 and a 'gphi_iterator' subclass for iterating over PHI nodes. The
1 following snippet will pretty-print all PHI nodes the statements of the
1 current function in the GIMPLE representation.
1
1 basic_block bb;
1
1 FOR_EACH_BB (bb)
1 {
1 gphi_iterator pi;
1 gimple_stmt_iterator si;
1
1 for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi))
1 {
1 gphi *phi = pi.phi ();
1 print_gimple_stmt (dump_file, phi, 0, TDF_SLIM);
1 }
1 for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
1 {
1 gimple stmt = gsi_stmt (si);
1 print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
1 }
1 }
1