gccint: Insn Splitting

17.16 Defining How to Split Instructions
========================================

There are two cases where you should specify how to split a pattern into
multiple insns.  On machines that have instructions requiring delay
slots (⇒Delay Slots) or that have instructions whose output is not
available for multiple cycles (⇒Processor pipeline description), the
compiler phases that optimize these cases need to be able to move insns
into one-instruction delay slots.  However, some insns may generate more
than one machine instruction.  These insns cannot be placed into a delay
slot.

 Often you can rewrite the single insn as a list of individual insns,
each corresponding to one machine instruction.  The disadvantage of
doing so is that it makes compilation slower and requires more space.
If the resulting insns are too complex, it may also suppress some
optimizations.  The compiler splits the insn if there is a reason to
believe that it might improve instruction or delay slot scheduling.

 The insn combiner phase also splits putative insns.  If three insns are
merged into one insn with a complex expression that cannot be matched by
some 'define_insn' pattern, the combiner phase attempts to split the
complex pattern into two insns that are recognized.  Usually it can
break the complex pattern into two patterns by splitting out some
subexpression.  However, in some other cases, such as performing an
addition of a large constant in two insns on a RISC machine, the way to
split the addition into two insns is machine-dependent.

 The 'define_split' definition tells the compiler how to split a complex
insn into several simpler insns.  It looks like this:

     (define_split
       [INSN-PATTERN]
       "CONDITION"
       [NEW-INSN-PATTERN-1
        NEW-INSN-PATTERN-2
        ...]
       "PREPARATION-STATEMENTS")

 INSN-PATTERN is a pattern that needs to be split and CONDITION is the
final condition to be tested, as in a 'define_insn'.  When an insn
matching INSN-PATTERN and satisfying CONDITION is found, it is replaced
in the insn list with the insns given by NEW-INSN-PATTERN-1,
NEW-INSN-PATTERN-2, etc.

 The PREPARATION-STATEMENTS are similar to those statements that are
specified for 'define_expand' (⇒Expander Definitions) and are
executed before the new RTL is generated to prepare for the generated
code or emit some insns whose pattern is not fixed.  Unlike those in
'define_expand', however, these statements must not generate any new
pseudo-registers.  Once reload has completed, they also must not
allocate any space in the stack frame.

 Patterns are matched against INSN-PATTERN in two different
circumstances.  If an insn needs to be split for delay slot scheduling
or insn scheduling, the insn is already known to be valid, which means
that it must have been matched by some 'define_insn' and, if
'reload_completed' is nonzero, is known to satisfy the constraints of
that 'define_insn'.  In that case, the new insn patterns must also be
insns that are matched by some 'define_insn' and, if 'reload_completed'
is nonzero, must also satisfy the constraints of those definitions.

 As an example of this use of 'define_split', consider the following
pattern from 'a29k.md', which splits a 'sign_extend' from 'HImode' to
'SImode' into a pair of shift insns:

     (define_split
       [(set (match_operand:SI 0 "gen_reg_operand" "")
             (sign_extend:SI (match_operand:HI 1 "gen_reg_operand" "")))]
       ""
       [(set (match_dup 0)
             (ashift:SI (match_dup 1)
                        (const_int 16)))
        (set (match_dup 0)
             (ashiftrt:SI (match_dup 0)
                          (const_int 16)))]
       "
     { operands[1] = gen_lowpart (SImode, operands[1]); }")

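 The shift pair produced by this split can be checked outside the
compiler.  The following C sketch is an illustration only, not code from
GCC: the names 'asr32' and 'sign_extend_hi_by_shifts' are hypothetical,
and it assumes a 32-bit 'SImode' register whose low 16 bits hold the
'HImode' value, which is the view that 'gen_lowpart' provides.

```c
#include <assert.h>
#include <stdint.h>

/* Portable arithmetic right shift of a 32-bit value (0 <= n < 32).
   C leaves >> on negative signed values implementation-defined, so the
   negative case goes through the bitwise complement instead.  */
int32_t asr32 (int32_t x, unsigned n)
{
  assert (n < 32);
  return x < 0 ? ~(~x >> n) : x >> n;
}

/* Model of the two insns emitted by the split.  REG is the 32-bit
   register; only its low 16 bits are meaningful, the upper 16 bits may
   hold anything.  Assumption: converting an out-of-range uint32_t to
   int32_t wraps as on two's complement targets.  */
int32_t sign_extend_hi_by_shifts (uint32_t reg)
{
  uint32_t shifted = reg << 16;           /* ashift:SI by 16 */
  return asr32 ((int32_t) shifted, 16);   /* ashiftrt:SI by 16 */
}
```

 Calling 'sign_extend_hi_by_shifts' with a register whose low half is
0x8000 yields -32768 whatever the upper half contains, which is why the
left shift must come first: it discards the stale upper bits and moves
the 'HImode' sign bit into the 'SImode' sign position.
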
 When the combiner phase tries to split an insn pattern, it is always
the case that the pattern is _not_ matched by any 'define_insn'.  The
combiner pass first tries to split a single 'set' expression and then
the same 'set' expression inside a 'parallel', but followed by a
'clobber' of a pseudo-reg to use as a scratch register.  In these cases,
the combiner expects exactly two new insn patterns to be generated.  It
will verify that these patterns match some 'define_insn' definitions, so
you need not do this test in the 'define_split' (of course, there is no
point in writing a 'define_split' that will never produce insns that
match).

 Here is an example of this use of 'define_split', taken from
'rs6000.md':

     (define_split
       [(set (match_operand:SI 0 "gen_reg_operand" "")
             (plus:SI (match_operand:SI 1 "gen_reg_operand" "")
                      (match_operand:SI 2 "non_add_cint_operand" "")))]
       ""
       [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3)))
        (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))]
       "
     {
       int low = INTVAL (operands[2]) & 0xffff;
       int high = (unsigned) INTVAL (operands[2]) >> 16;

       if (low & 0x8000)
         high++, low |= 0xffff0000;

       operands[3] = GEN_INT (high << 16);
       operands[4] = GEN_INT (low);
     }")

 Here the predicate 'non_add_cint_operand' matches any 'const_int' that
is _not_ a valid operand of a single add insn.  The add with the smaller
displacement is written so that it can be substituted into the address
of a subsequent operation.

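 The arithmetic in the preparation statements of this split can be
tried in isolation.  The following C sketch is not taken from GCC: the
type 'add_parts' and the function 'split_add_constant' are hypothetical
names, and unsigned 32-bit arithmetic stands in for the 'int' arithmetic
of the original so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* The two addends that replace one large constant.  */
struct add_parts
{
  uint32_t high_part;   /* multiple of 0x10000, for the first add */
  uint32_t low_part;    /* sign-extended 16-bit value, for the second */
};

/* Split C so that high_part + low_part == C modulo 2^32.  When bit 15
   of the low half is set, the low part is negative once sign-extended,
   so the high half is incremented to compensate.  */
struct add_parts split_add_constant (uint32_t c)
{
  uint32_t low = c & 0xffffu;
  uint32_t high = c >> 16;

  if (low & 0x8000u)
    {
      high++;
      low |= 0xffff0000u;
    }

  struct add_parts p = { high << 16, low };
  /* The two adds together must reproduce the original constant.  */
  assert ((uint32_t) (p.high_part + p.low_part) == c);
  return p;
}
```

 For the constant 0x12348765, the sketch yields a high part of
0x12350000 and a low part of 0xffff8765 (that is, -0x789b), each of
which fits the immediate field of one add insn on a machine with 16-bit
signed and shifted immediates.
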
 An example that uses a scratch register, from the same file, generates
an equality comparison of a register and a large constant:

     (define_split
       [(set (match_operand:CC 0 "cc_reg_operand" "")
             (compare:CC (match_operand:SI 1 "gen_reg_operand" "")
                         (match_operand:SI 2 "non_short_cint_operand" "")))
        (clobber (match_operand:SI 3 "gen_reg_operand" ""))]
       "find_single_use (operands[0], insn, 0)
        && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ
            || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)"
       [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4)))
        (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))]
       "
     {
       /* Get the constant we are comparing against, C, and see what it
          looks like sign-extended to 16 bits.  Then see what constant
          could be XOR'ed with C to get the sign-extended value.  */

       int c = INTVAL (operands[2]);
       int sextc = (c << 16) >> 16;
       int xorv = c ^ sextc;

       operands[4] = GEN_INT (xorv);
       operands[5] = GEN_INT (sextc);
     }")

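 The reason this split is restricted to 'EQ' and 'NE' uses is that
XOR-ing both sides of a comparison with the same value preserves
equality but not ordering.  The following C sketch illustrates the
identity; it is not GCC code, the names 'cmp_parts',
'split_compare_constant' and 'split_equal' are hypothetical, and
unsigned 32-bit values stand in for 'SImode'.

```c
#include <assert.h>
#include <stdint.h>

/* The two constants computed by the preparation statements.  */
struct cmp_parts
{
  uint32_t xorv;    /* constant for the xor insn, low 16 bits zero */
  uint32_t sextc;   /* low 16 bits of C, sign-extended to 32 bits */
};

struct cmp_parts split_compare_constant (uint32_t c)
{
  uint32_t sextc = c & 0xffffu;
  if (sextc & 0x8000u)
    sextc |= 0xffff0000u;

  struct cmp_parts p = { c ^ sextc, sextc };
  /* Only the upper half of xorv can be nonzero, because C and sextc
     agree in their low 16 bits.  */
  assert ((p.xorv & 0xffffu) == 0);
  return p;
}

/* Model of the split sequence followed by an EQ test:
   (x ^ xorv) == sextc holds exactly when x == C,
   since xorv ^ sextc == C.  */
int split_equal (uint32_t x, uint32_t c)
{
  struct cmp_parts p = split_compare_constant (c);
  uint32_t scratch = x ^ p.xorv;   /* first insn, into the scratch reg */
  return scratch == p.sextc;       /* second insn: compare with sextc */
}
```

 For C equal to 0x12348765 the sketch computes 'xorv' as 0xedcb0000 and
'sextc' as 0xffff8765, so the large comparison becomes an XOR with an
upper-half constant followed by a comparison against a short signed
constant.
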
 To avoid confusion, don't write a single 'define_split' that accepts
some insns that match some 'define_insn' as well as some insns that
don't.  Instead, write two separate 'define_split' definitions, one for
the insns that are valid and one for the insns that are not valid.

 The splitter is allowed to split jump instructions into a sequence of
jumps or to create new jumps while splitting non-jump instructions.  As
the control flow graph and branch prediction information need to be
updated, several restrictions apply.

 Splitting a jump instruction into a sequence that ends with another
jump instruction is always valid, as the compiler expects the new jump
to behave identically to the original.  When the new sequence contains
multiple jump instructions or new labels, more assistance is needed.
The splitter is required to create only unconditional jumps or simple
conditional jump instructions.  Additionally, it must attach a
'REG_BR_PROB' note to each conditional jump.  A global variable
'split_branch_probability' holds the probability of the original branch
if it was a simple conditional jump, and -1 otherwise.  To simplify
recomputation of edge frequencies, the new sequence is required to have
only forward jumps to the newly created labels.

 For the common case where the pattern of a 'define_split' exactly
matches the pattern of a 'define_insn', use 'define_insn_and_split'.  It
looks like this:

     (define_insn_and_split
       [INSN-PATTERN]
       "CONDITION"
       "OUTPUT-TEMPLATE"
       "SPLIT-CONDITION"
       [NEW-INSN-PATTERN-1
        NEW-INSN-PATTERN-2
        ...]
       "PREPARATION-STATEMENTS"
       [INSN-ATTRIBUTES])

 INSN-PATTERN, CONDITION, OUTPUT-TEMPLATE, and INSN-ATTRIBUTES are used
as in 'define_insn'.  The NEW-INSN-PATTERN vector and the
PREPARATION-STATEMENTS are used as in a 'define_split'.  The
SPLIT-CONDITION is also used as in 'define_split', with the additional
behavior that if the condition starts with '&&', the condition used for
the split is constructed as the logical "and" of the insn condition and
the split condition.  For example, from 'i386.md':

     (define_insn_and_split "zero_extendhisi2_and"
       [(set (match_operand:SI 0 "register_operand" "=r")
             (zero_extend:SI (match_operand:HI 1 "register_operand" "0")))
        (clobber (reg:CC 17))]
       "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size"
       "#"
       "&& reload_completed"
       [(parallel [(set (match_dup 0)
                        (and:SI (match_dup 0) (const_int 65535)))
                   (clobber (reg:CC 17))])]
       ""
       [(set_attr "type" "alu1")])

 In this case, the actual split condition will be
'TARGET_ZERO_EXTEND_WITH_AND && !optimize_size && reload_completed'.

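 The '&&' rule can be pictured as a small string operation.  The
following C sketch is only a model of the rule as described here, not
the code GCC uses when reading machine descriptions: the function
'compose_split_condition' is hypothetical, it assumes a non-empty insn
condition, and it parenthesizes the insn condition (an assumption made
here so that an insn condition containing '||' keeps its meaning).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the effective split condition into OUT (OUT_SIZE bytes).
   If SPLIT_COND starts with "&&", the result is the insn condition in
   parentheses followed by SPLIT_COND; otherwise SPLIT_COND is used
   alone.  Returns 1 on success, 0 if OUT is too small.
   Assumption: INSN_COND is not the empty string.  */
int compose_split_condition (const char *insn_cond, const char *split_cond,
                             char *out, size_t out_size)
{
  int n;

  assert (insn_cond[0] != '\0');
  if (strncmp (split_cond, "&&", 2) == 0)
    n = snprintf (out, out_size, "(%s) %s", insn_cond, split_cond);
  else
    n = snprintf (out, out_size, "%s", split_cond);

  return n >= 0 && (size_t) n < out_size;
}
```

 Given the two condition strings of 'zero_extendhisi2_and', the sketch
produces '(TARGET_ZERO_EXTEND_WITH_AND && !optimize_size) &&
reload_completed', which is logically the same as the condition quoted
above.  A split condition that does not begin with '&&' is used
unchanged.
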
 The 'define_insn_and_split' construction provides exactly the same
functionality as two separate 'define_insn' and 'define_split' patterns.
It exists for compactness, and as a maintenance tool that removes the
need to keep the templates of two separate patterns in sync.