gccint: define_peephole2

17.18.2 RTL to RTL Peephole Optimizers
--------------------------------------

The 'define_peephole2' definition tells the compiler how to substitute
one sequence of instructions for another sequence, what additional
scratch registers may be needed, and what their lifetimes must be.

     (define_peephole2
       [INSN-PATTERN-1
        INSN-PATTERN-2
        ...]
       "CONDITION"
       [NEW-INSN-PATTERN-1
        NEW-INSN-PATTERN-2
        ...]
       "PREPARATION-STATEMENTS")

The definition is almost identical to 'define_split' (*note Insn
Splitting::) except that the pattern to match is not a single
instruction, but a sequence of instructions.

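 As a minimal sketch of the general form, the following hypothetical
peephole (not taken from any real machine description) matches a
register copy followed by a second copy of the same value and replaces
the pair with a single copy.  It assumes a target on which a plain
'SImode' register-to-register move is a valid instruction.  It also
uses 'peep2_reg_dead_p', a helper from GCC's peephole pass that this
section does not otherwise describe, to check that the intermediate
register is no longer needed once the matched sequence ends.

     ;; Hypothetical example: fold "op0 = op1; op2 = op0" into
     ;; "op2 = op1" when op0 is dead after the matched sequence.
     (define_peephole2
       [(set (match_operand:SI 0 "register_operand" "")
             (match_operand:SI 1 "register_operand" ""))
        (set (match_operand:SI 2 "register_operand" "")
             (match_dup 0))]
       ;; Offset 2 is the point after both matched instructions.
       "peep2_reg_dead_p (2, operands[0])"
       [(set (match_dup 2) (match_dup 1))]
       "")

Both input patterns must match the instruction sequence, and the C
condition must hold, before the replacement sequence is used.
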
 It is possible to request additional scratch registers for use in the
output template.  If appropriate registers are not free, the pattern
will simply not match.

 Scratch registers are requested with a 'match_scratch' pattern at the
top level of the input pattern.  The allocated register (initially) will
be dead at the point requested within the original sequence.  If the
scratch is used at more than a single point, a 'match_dup' pattern at
the top level of the input pattern marks the last position in the input
sequence at which the register must be available.

 Here is an example from the IA-32 machine description:

     (define_peephole2
       [(match_scratch:SI 2 "r")
        (parallel [(set (match_operand:SI 0 "register_operand" "")
                        (match_operator:SI 3 "arith_or_logical_operator"
                          [(match_dup 0)
                           (match_operand:SI 1 "memory_operand" "")]))
                   (clobber (reg:CC 17))])]
       "! optimize_size && ! TARGET_READ_MODIFY"
       [(set (match_dup 2) (match_dup 1))
        (parallel [(set (match_dup 0)
                        (match_op_dup 3 [(match_dup 0) (match_dup 2)]))
                   (clobber (reg:CC 17))])]
       "")

This pattern tries to split a load from its use in the hopes that we'll
be able to schedule around the memory load latency.  It allocates a
single 'SImode' register of class 'GENERAL_REGS' ('"r"') that needs to
be live only at the point just before the arithmetic.

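 To make the effect concrete, here is a sketch of what the IA-32
peephole might do to one instruction.  The specific choices are
assumptions made for illustration only: operand 0 is taken to be hard
register 0 ('ax'), the memory operand is addressed through hard
register 3 ('bx'), the matched operator is 'plus', and hard register 2
('cx') is assumed to be the free register chosen for the scratch.

     ;; Before: one instruction that both loads and adds.
     (parallel [(set (reg:SI 0 ax)
                     (plus:SI (reg:SI 0 ax)
                              (mem:SI (reg:SI 3 bx))))
                (clobber (reg:CC 17 flags))])

     ;; After: the load is a separate instruction into the scratch.
     (set (reg:SI 2 cx) (mem:SI (reg:SI 3 bx)))
     (parallel [(set (reg:SI 0 ax)
                     (plus:SI (reg:SI 0 ax)
                              (reg:SI 2 cx)))
                (clobber (reg:CC 17 flags))])

With the load split out, a later scheduling pass may be able to place
other instructions between it and the addition.
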
 A real example requiring extended scratch lifetimes is harder to come
by, so here's a silly made-up example:

     (define_peephole2
       [(match_scratch:SI 4 "r")
        (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" ""))
        (set (match_operand:SI 2 "" "") (match_dup 1))
        (match_dup 4)
        (set (match_operand:SI 3 "" "") (match_dup 1))]
       "/* determine 1 does not overlap 0 and 2 */"
       [(set (match_dup 4) (match_dup 1))
        (set (match_dup 0) (match_dup 4))
        (set (match_dup 2) (match_dup 4))
        (set (match_dup 3) (match_dup 4))]
       "")

If we had not added the '(match_dup 4)' in the middle of the input
sequence, it might have been the case that the register we chose at the
beginning of the sequence is killed by the first or second 'set'.