gccint: The Language
1
1 26.2 The Language
1 =================
1
1 The language to write expression simplifications in resembles other
1 domain-specific languages GCC uses. Thus it is lispy. Lets start with
1 an example from the match.pd file:
1
1 (simplify
1 (bit_and @0 integer_all_onesp)
1 @0)
1
1 This example contains all required parts of an expression
1 simplification. A simplification is wrapped inside a '(simplify ...)'
1 expression. That contains at least two operands - an expression that is
1 matched with the GIMPLE or GENERIC IL and a replacement expression that
1 is returned if the match was successful.
1
1 Expressions have an operator ID, 'bit_and' in this case. Expressions
1 can be lower-case tree codes with '_expr' stripped off or builtin
1 function code names in all-caps, like 'BUILT_IN_SQRT'.
1
1 '@n' denotes a so-called capture. It captures the operand and lets you
1 refer to it in other places of the match-and-simplify. In the above
1 example it is refered to in the replacement expression. Captures are
1 '@' followed by a number or an identifier.
1
1 (simplify
1 (bit_xor @0 @0)
1 { build_zero_cst (type); })
1
1 In this example '@0' is mentioned twice which constrains the matched
1 expression to have two equal operands. Usually matches are constraint
1 to equal types. If operands may be constants and conversions are
1 involved matching by value might be preferred in which case use '@@0' to
1 denote a by value match and the specific operand you want to refer to in
1 the result part. This example also introduces operands written in C
1 code. These can be used in the expression replacements and are supposed
1 to evaluate to a tree node which has to be a valid GIMPLE operand (so
1 you cannot generate expressions in C code).
1
1 (simplify
1 (trunc_mod integer_zerop@0 @1)
1 (if (!integer_zerop (@1))
1 @0))
1
1 Here '@0' captures the first operand of the trunc_mod expression which
1 is also predicated with 'integer_zerop'. Expression operands may be
1 either expressions, predicates or captures. Captures can be
1 unconstrained or capture expresions or predicates.
1
1 This example introduces an optional operand of simplify, the
1 if-expression. This condition is evaluated after the expression matched
1 in the IL and is required to evaluate to true to enable the replacement
1 expression in the second operand position. The expression operand of
1 the 'if' is a standard C expression which may contain references to
1 captures. The 'if' has an optional third operand which may contain the
1 replacement expression that is enabled when the condition evaluates to
1 false.
1
1 A 'if' expression can be used to specify a common condition for
1 multiple simplify patterns, avoiding the need to repeat that multiple
1 times:
1
1 (if (!TYPE_SATURATING (type)
1 && !FLOAT_TYPE_P (type) && !FIXED_POINT_TYPE_P (type))
1 (simplify
1 (minus (plus @0 @1) @0)
1 @1)
1 (simplify
1 (minus (minus @0 @1) @0)
1 (negate @1)))
1
1 Note that 'if's in outer position do not have the optional else clause
1 but instead have multiple then clauses.
1
1 Ifs can be nested.
1
1 There exists a 'switch' expression which can be used to chain
1 conditions avoiding nesting 'if's too much:
1
1 (simplify
1 (simple_comparison @0 REAL_CST@1)
1 (switch
1 /* a CMP (-0) -> a CMP 0 */
1 (if (REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@1)))
1 (cmp @0 { build_real (TREE_TYPE (@1), dconst0); }))
1 /* x != NaN is always true, other ops are always false. */
1 (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
1 && ! HONOR_SNANS (@1))
1 { constant_boolean_node (cmp == NE_EXPR, type); })))
1
1 Is equal to
1
1 (simplify
1 (simple_comparison @0 REAL_CST@1)
1 (switch
1 /* a CMP (-0) -> a CMP 0 */
1 (if (REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@1)))
1 (cmp @0 { build_real (TREE_TYPE (@1), dconst0); })
1 /* x != NaN is always true, other ops are always false. */
1 (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
1 && ! HONOR_SNANS (@1))
1 { constant_boolean_node (cmp == NE_EXPR, type); }))))
1
1 which has the second 'if' in the else operand of the first. The
1 'switch' expression takes 'if' expressions as operands (which may not
1 have else clauses) and as a last operand a replacement expression which
1 should be enabled by default if no other condition evaluated to true.
1
1 Captures can also be used for capturing results of sub-expressions.
1
1 #if GIMPLE
1 (simplify
1 (pointer_plus (addr@2 @0) INTEGER_CST_P@1)
1 (if (is_gimple_min_invariant (@2)))
1 {
1 poly_int64 off;
1 tree base = get_addr_base_and_unit_offset (@0, &off);
1 off += tree_to_uhwi (@1);
1 /* Now with that we should be able to simply write
1 (addr (mem_ref (addr @base) (plus @off @1))) */
1 build1 (ADDR_EXPR, type,
1 build2 (MEM_REF, TREE_TYPE (TREE_TYPE (@2)),
1 build_fold_addr_expr (base),
1 build_int_cst (ptr_type_node, off)));
1 })
1 #endif
1
1 In the above example, '@2' captures the result of the expression '(addr
1 @0)'. For outermost expression only its type can be captured, and the
1 keyword 'type' is reserved for this purpose. The above example also
1 gives a way to conditionalize patterns to only apply to 'GIMPLE' or
1 'GENERIC' by means of using the pre-defined preprocessor macros 'GIMPLE'
1 and 'GENERIC' and using preprocessor directives.
1
1 (simplify
1 (bit_and:c integral_op_p@0 (bit_ior:c (bit_not @0) @1))
1 (bit_and @1 @0))
1
1 Here we introduce flags on match expressions. The flag used above,
1 'c', denotes that the expression should be also matched commutated.
1 Thus the above match expression is really the following four match
1 expressions:
1
1 (bit_and integral_op_p@0 (bit_ior (bit_not @0) @1))
1 (bit_and (bit_ior (bit_not @0) @1) integral_op_p@0)
1 (bit_and integral_op_p@0 (bit_ior @1 (bit_not @0)))
1 (bit_and (bit_ior @1 (bit_not @0)) integral_op_p@0)
1
1 Usual canonicalizations you know from GENERIC expressions are applied
1 before matching, so for example constant operands always come second in
1 commutative expressions.
1
1 The second supported flag is 's' which tells the code generator to fail
1 the pattern if the expression marked with 's' does have more than one
1 use. For example in
1
1 (simplify
1 (pointer_plus (pointer_plus:s @0 @1) @3)
1 (pointer_plus @0 (plus @1 @3)))
1
1 this avoids the association if '(pointer_plus @0 @1)' is used outside
1 of the matched expression and thus it would stay live and not trivially
1 removed by dead code elimination.
1
1 More features exist to avoid too much repetition.
1
1 (for op (plus pointer_plus minus bit_ior bit_xor)
1 (simplify
1 (op @0 integer_zerop)
1 @0))
1
1 A 'for' expression can be used to repeat a pattern for each operator
1 specified, substituting 'op'. 'for' can be nested and a 'for' can have
1 multiple operators to iterate.
1
1 (for opa (plus minus)
1 opb (minus plus)
1 (for opc (plus minus)
1 (simplify...
1
1 In this example the pattern will be repeated four times with 'opa, opb,
1 opc' being 'plus, minus, plus', 'plus, minus, minus', 'minus, plus,
1 plus', 'minus, plus, minus'.
1
1 To avoid repeating operator lists in 'for' you can name them via
1
1 (define_operator_list pmm plus minus mult)
1
1 and use them in 'for' operator lists where they get expanded.
1
1 (for opa (pmm trunc_div)
1 (simplify...
1
1 So this example iterates over 'plus', 'minus', 'mult' and 'trunc_div'.
1
1 Using operator lists can also remove the need to explicitely write a
1 'for'. All operator list uses that appear in a 'simplify' or 'match'
1 pattern in operator positions will implicitely be added to a new 'for'.
1 For example
1
1 (define_operator_list SQRT BUILT_IN_SQRTF BUILT_IN_SQRT BUILT_IN_SQRTL)
1 (define_operator_list POW BUILT_IN_POWF BUILT_IN_POW BUILT_IN_POWL)
1 (simplify
1 (SQRT (POW @0 @1))
1 (POW (abs @0) (mult @1 { built_real (TREE_TYPE (@1), dconsthalf); })))
1
1 is the same as
1
1 (for SQRT (BUILT_IN_SQRTF BUILT_IN_SQRT BUILT_IN_SQRTL)
1 POW (BUILT_IN_POWF BUILT_IN_POW BUILT_IN_POWL)
1 (simplify
1 (SQRT (POW @0 @1))
1 (POW (abs @0) (mult @1 { built_real (TREE_TYPE (@1), dconsthalf); }))))
1
1 'for's and operator lists can include the special identifier 'null'
1 that matches nothing and can never be generated. This can be used to
1 pad an operator list so that it has a standard form, even if there isn't
1 a suitable operator for every form.
1
1 Another building block are 'with' expressions in the result expression
1 which nest the generated code in a new C block followed by its argument:
1
1 (simplify
1 (convert (mult @0 @1))
1 (with { tree utype = unsigned_type_for (type); }
1 (convert (mult (convert:utype @0) (convert:utype @1)))))
1
1 This allows code nested in the 'with' to refer to the declared
1 variables. In the above case we use the feature to specify the type of
1 a generated expression with the ':type' syntax where 'type' needs to be
1 an identifier that refers to the desired type. Usually the types of the
1 generated result expressions are determined from the context, but
1 sometimes like in the above case it is required that you specify them
1 explicitely.
1
1 As intermediate conversions are often optional there is a way to avoid
1 the need to repeat patterns both with and without such conversions.
1 Namely you can mark a conversion as being optional with a '?':
1
1 (simplify
1 (eq (convert@0 @1) (convert? @2))
1 (eq @1 (convert @2)))
1
1 which will match both '(eq (convert @1) (convert @2))' and '(eq
1 (convert @1) @2)'. The optional converts are supposed to be all either
1 present or not, thus '(eq (convert? @1) (convert? @2))' will result in
1 two patterns only. If you want to match all four combinations you have
1 access to two additional conditional converts as in '(eq (convert1? @1)
1 (convert2? @2))'.
1
1 Predicates available from the GCC middle-end need to be made available
1 explicitely via 'define_predicates':
1
1 (define_predicates
1 integer_onep integer_zerop integer_all_onesp)
1
1 You can also define predicates using the pattern matching language and
1 the 'match' form:
1
1 (match negate_expr_p
1 INTEGER_CST
1 (if (TYPE_OVERFLOW_WRAPS (type)
1 || may_negate_without_overflow_p (t))))
1 (match negate_expr_p
1 (negate @0))
1
1 This shows that for 'match' expressions there is 't' available which
1 captures the outermost expression (something not possible in the
1 'simplify' context). As you can see 'match' has an identifier as first
1 operand which is how you refer to the predicate in patterns. Multiple
1 'match' for the same identifier add additional cases where the predicate
1 matches.
1
1 Predicates can also match an expression in which case you need to
1 provide a template specifying the identifier and where to get its
1 operands from:
1
1 (match (logical_inverted_value @0)
1 (eq @0 integer_zerop))
1 (match (logical_inverted_value @0)
1 (bit_not truth_valued_p@0))
1
1 You can use the above predicate like
1
1 (simplify
1 (bit_and @0 (logical_inverted_value @0))
1 { build_zero_cst (type); })
1
1 Which will match a bitwise and of an operand with its logical inverted
1 value.
1