gccint: The Language

1 
1 26.2 The Language
1 =================
1 
1 The language to write expression simplifications in resembles other
1 domain-specific languages GCC uses.  Thus it is lispy.  Lets start with
1 an example from the match.pd file:
1 
1      (simplify
1        (bit_and @0 integer_all_onesp)
1        @0)
1 
1  This example contains all required parts of an expression
1 simplification.  A simplification is wrapped inside a '(simplify ...)'
1 expression.  That contains at least two operands - an expression that is
1 matched with the GIMPLE or GENERIC IL and a replacement expression that
1 is returned if the match was successful.
1 
1  Expressions have an operator ID, 'bit_and' in this case.  Expressions
1 can be lower-case tree codes with '_expr' stripped off or builtin
1 function code names in all-caps, like 'BUILT_IN_SQRT'.
1 
1  '@n' denotes a so-called capture.  It captures the operand and lets you
1 refer to it in other places of the match-and-simplify.  In the above
1 example it is refered to in the replacement expression.  Captures are
1 '@' followed by a number or an identifier.
1 
1      (simplify
1        (bit_xor @0 @0)
1        { build_zero_cst (type); })
1 
1  In this example '@0' is mentioned twice which constrains the matched
1 expression to have two equal operands.  Usually matches are constraint
1 to equal types.  If operands may be constants and conversions are
1 involved matching by value might be preferred in which case use '@@0' to
1 denote a by value match and the specific operand you want to refer to in
1 the result part.  This example also introduces operands written in C
1 code.  These can be used in the expression replacements and are supposed
1 to evaluate to a tree node which has to be a valid GIMPLE operand (so
1 you cannot generate expressions in C code).
1 
1      (simplify
1        (trunc_mod integer_zerop@0 @1)
1        (if (!integer_zerop (@1))
1         @0))
1 
1  Here '@0' captures the first operand of the trunc_mod expression which
1 is also predicated with 'integer_zerop'.  Expression operands may be
1 either expressions, predicates or captures.  Captures can be
1 unconstrained or capture expresions or predicates.
1 
1  This example introduces an optional operand of simplify, the
1 if-expression.  This condition is evaluated after the expression matched
1 in the IL and is required to evaluate to true to enable the replacement
1 expression in the second operand position.  The expression operand of
1 the 'if' is a standard C expression which may contain references to
1 captures.  The 'if' has an optional third operand which may contain the
1 replacement expression that is enabled when the condition evaluates to
1 false.
1 
1  A 'if' expression can be used to specify a common condition for
1 multiple simplify patterns, avoiding the need to repeat that multiple
1 times:
1 
1      (if (!TYPE_SATURATING (type)
1           && !FLOAT_TYPE_P (type) && !FIXED_POINT_TYPE_P (type))
1        (simplify
1          (minus (plus @0 @1) @0)
1          @1)
1        (simplify
1          (minus (minus @0 @1) @0)
1          (negate @1)))
1 
1  Note that 'if's in outer position do not have the optional else clause
1 but instead have multiple then clauses.
1 
1  Ifs can be nested.
1 
1  There exists a 'switch' expression which can be used to chain
1 conditions avoiding nesting 'if's too much:
1 
1      (simplify
1       (simple_comparison @0 REAL_CST@1)
1       (switch
1        /* a CMP (-0) -> a CMP 0  */
1        (if (REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@1)))
1         (cmp @0 { build_real (TREE_TYPE (@1), dconst0); }))
1        /* x != NaN is always true, other ops are always false.  */
1        (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
1             && ! HONOR_SNANS (@1))
1         { constant_boolean_node (cmp == NE_EXPR, type); })))
1 
1  Is equal to
1 
1      (simplify
1       (simple_comparison @0 REAL_CST@1)
1       (switch
1        /* a CMP (-0) -> a CMP 0  */
1        (if (REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@1)))
1         (cmp @0 { build_real (TREE_TYPE (@1), dconst0); })
1         /* x != NaN is always true, other ops are always false.  */
1         (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@1))
1              && ! HONOR_SNANS (@1))
1          { constant_boolean_node (cmp == NE_EXPR, type); }))))
1 
1  which has the second 'if' in the else operand of the first.  The
1 'switch' expression takes 'if' expressions as operands (which may not
1 have else clauses) and as a last operand a replacement expression which
1 should be enabled by default if no other condition evaluated to true.
1 
1  Captures can also be used for capturing results of sub-expressions.
1 
1      #if GIMPLE
1      (simplify
1        (pointer_plus (addr@2 @0) INTEGER_CST_P@1)
1        (if (is_gimple_min_invariant (@2)))
1        {
1          poly_int64 off;
1          tree base = get_addr_base_and_unit_offset (@0, &off);
1          off += tree_to_uhwi (@1);
1          /* Now with that we should be able to simply write
1             (addr (mem_ref (addr @base) (plus @off @1)))  */
1          build1 (ADDR_EXPR, type,
1                  build2 (MEM_REF, TREE_TYPE (TREE_TYPE (@2)),
1                          build_fold_addr_expr (base),
1                          build_int_cst (ptr_type_node, off)));
1        })
1      #endif
1 
1  In the above example, '@2' captures the result of the expression '(addr
1 @0)'.  For outermost expression only its type can be captured, and the
1 keyword 'type' is reserved for this purpose.  The above example also
1 gives a way to conditionalize patterns to only apply to 'GIMPLE' or
1 'GENERIC' by means of using the pre-defined preprocessor macros 'GIMPLE'
1 and 'GENERIC' and using preprocessor directives.
1 
1      (simplify
1        (bit_and:c integral_op_p@0 (bit_ior:c (bit_not @0) @1))
1        (bit_and @1 @0))
1 
1  Here we introduce flags on match expressions.  The flag used above,
1 'c', denotes that the expression should be also matched commutated.
1 Thus the above match expression is really the following four match
1 expressions:
1 
1        (bit_and integral_op_p@0 (bit_ior (bit_not @0) @1))
1        (bit_and (bit_ior (bit_not @0) @1) integral_op_p@0)
1        (bit_and integral_op_p@0 (bit_ior @1 (bit_not @0)))
1        (bit_and (bit_ior @1 (bit_not @0)) integral_op_p@0)
1 
1  Usual canonicalizations you know from GENERIC expressions are applied
1 before matching, so for example constant operands always come second in
1 commutative expressions.
1 
1  The second supported flag is 's' which tells the code generator to fail
1 the pattern if the expression marked with 's' does have more than one
1 use.  For example in
1 
1      (simplify
1        (pointer_plus (pointer_plus:s @0 @1) @3)
1        (pointer_plus @0 (plus @1 @3)))
1 
1  this avoids the association if '(pointer_plus @0 @1)' is used outside
1 of the matched expression and thus it would stay live and not trivially
1 removed by dead code elimination.
1 
1  More features exist to avoid too much repetition.
1 
1      (for op (plus pointer_plus minus bit_ior bit_xor)
1        (simplify
1          (op @0 integer_zerop)
1          @0))
1 
1  A 'for' expression can be used to repeat a pattern for each operator
1 specified, substituting 'op'.  'for' can be nested and a 'for' can have
1 multiple operators to iterate.
1 
1      (for opa (plus minus)
1           opb (minus plus)
1        (for opc (plus minus)
1          (simplify...
1 
1  In this example the pattern will be repeated four times with 'opa, opb,
1 opc' being 'plus, minus, plus', 'plus, minus, minus', 'minus, plus,
1 plus', 'minus, plus, minus'.
1 
1  To avoid repeating operator lists in 'for' you can name them via
1 
1      (define_operator_list pmm plus minus mult)
1 
1  and use them in 'for' operator lists where they get expanded.
1 
1      (for opa (pmm trunc_div)
1       (simplify...
1 
1  So this example iterates over 'plus', 'minus', 'mult' and 'trunc_div'.
1 
1  Using operator lists can also remove the need to explicitely write a
1 'for'.  All operator list uses that appear in a 'simplify' or 'match'
1 pattern in operator positions will implicitely be added to a new 'for'.
1 For example
1 
1      (define_operator_list SQRT BUILT_IN_SQRTF BUILT_IN_SQRT BUILT_IN_SQRTL)
1      (define_operator_list POW BUILT_IN_POWF BUILT_IN_POW BUILT_IN_POWL)
1      (simplify
1       (SQRT (POW @0 @1))
1       (POW (abs @0) (mult @1 { built_real (TREE_TYPE (@1), dconsthalf); })))
1 
1  is the same as
1 
1      (for SQRT (BUILT_IN_SQRTF BUILT_IN_SQRT BUILT_IN_SQRTL)
1           POW (BUILT_IN_POWF BUILT_IN_POW BUILT_IN_POWL)
1       (simplify
1        (SQRT (POW @0 @1))
1        (POW (abs @0) (mult @1 { built_real (TREE_TYPE (@1), dconsthalf); }))))
1 
1  'for's and operator lists can include the special identifier 'null'
1 that matches nothing and can never be generated.  This can be used to
1 pad an operator list so that it has a standard form, even if there isn't
1 a suitable operator for every form.
1 
1  Another building block are 'with' expressions in the result expression
1 which nest the generated code in a new C block followed by its argument:
1 
1      (simplify
1       (convert (mult @0 @1))
1       (with { tree utype = unsigned_type_for (type); }
1        (convert (mult (convert:utype @0) (convert:utype @1)))))
1 
1  This allows code nested in the 'with' to refer to the declared
1 variables.  In the above case we use the feature to specify the type of
1 a generated expression with the ':type' syntax where 'type' needs to be
1 an identifier that refers to the desired type.  Usually the types of the
1 generated result expressions are determined from the context, but
1 sometimes like in the above case it is required that you specify them
1 explicitely.
1 
1  As intermediate conversions are often optional there is a way to avoid
1 the need to repeat patterns both with and without such conversions.
1 Namely you can mark a conversion as being optional with a '?':
1 
1      (simplify
1       (eq (convert@0 @1) (convert? @2))
1       (eq @1 (convert @2)))
1 
1  which will match both '(eq (convert @1) (convert @2))' and '(eq
1 (convert @1) @2)'.  The optional converts are supposed to be all either
1 present or not, thus '(eq (convert? @1) (convert? @2))' will result in
1 two patterns only.  If you want to match all four combinations you have
1 access to two additional conditional converts as in '(eq (convert1? @1)
1 (convert2? @2))'.
1 
1  Predicates available from the GCC middle-end need to be made available
1 explicitely via 'define_predicates':
1 
1      (define_predicates
1       integer_onep integer_zerop integer_all_onesp)
1 
1  You can also define predicates using the pattern matching language and
1 the 'match' form:
1 
1      (match negate_expr_p
1       INTEGER_CST
1       (if (TYPE_OVERFLOW_WRAPS (type)
1            || may_negate_without_overflow_p (t))))
1      (match negate_expr_p
1       (negate @0))
1 
1  This shows that for 'match' expressions there is 't' available which
1 captures the outermost expression (something not possible in the
1 'simplify' context).  As you can see 'match' has an identifier as first
1 operand which is how you refer to the predicate in patterns.  Multiple
1 'match' for the same identifier add additional cases where the predicate
1 matches.
1 
1  Predicates can also match an expression in which case you need to
1 provide a template specifying the identifier and where to get its
1 operands from:
1 
1      (match (logical_inverted_value @0)
1       (eq @0 integer_zerop))
1      (match (logical_inverted_value @0)
1       (bit_not truth_valued_p@0))
1 
1  You can use the above predicate like
1 
1      (simplify
1       (bit_and @0 (logical_inverted_value @0))
1       { build_zero_cst (type); })
1 
1  Which will match a bitwise and of an operand with its logical inverted
1 value.
1