gccint: Costs

1 
1 18.16 Describing Relative Costs of Operations
1 =============================================
1 
1 These macros let you describe the relative speed of various operations
1 on the target machine.
1 
1  -- Macro: REGISTER_MOVE_COST (MODE, FROM, TO)
1      A C expression for the cost of moving data of mode MODE from a
1      register in class FROM to one in class TO.  The classes are
1      expressed using the enumeration values such as 'GENERAL_REGS'.  A
1      value of 2 is the default; other values are interpreted relative to
1      that.
1 
1      It is not required that the cost always equal 2 when FROM is the
1      same as TO; on some machines it is expensive to move between
1      registers if they are not general registers.
1 
1      If reload sees an insn consisting of a single 'set' between two
1      hard registers, and if 'REGISTER_MOVE_COST' applied to their
1      classes returns a value of 2, reload does not check to ensure that
1      the constraints of the insn are met.  Setting a cost of other than
1      2 will allow reload to verify that the constraints are met.  You
1      should do this if the 'movM' pattern's constraints do not allow
1      such copying.
1 
1      This macro is obsolete; new ports should use the target hook
1      'TARGET_REGISTER_MOVE_COST' instead.
1 
1  -- Target Hook: int TARGET_REGISTER_MOVE_COST (machine_mode MODE,
1           reg_class_t FROM, reg_class_t TO)
1      This target hook should return the cost of moving data of mode MODE
1      from a register in class FROM to one in class TO.  The classes are
1      expressed using the enumeration values such as 'GENERAL_REGS'.  A
1      value of 2 is the default; other values are interpreted relative to
1      that.
1 
1      It is not required that the cost always equal 2 when FROM is the
1      same as TO; on some machines it is expensive to move between
1      registers if they are not general registers.
1 
1      If reload sees an insn consisting of a single 'set' between two
1      hard registers, and if 'TARGET_REGISTER_MOVE_COST' applied to their
1      classes returns a value of 2, reload does not check to ensure that
1      the constraints of the insn are met.  Setting a cost of other than
1      2 will allow reload to verify that the constraints are met.  You
1      should do this if the 'movM' pattern's constraints do not allow
1      such copying.
1 
1      The default version of this function returns 2.
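
A minimal sketch of such a hook follows.  It is not taken from any real port: 'enum reg_class' is redeclared as a stand-in, 'FLOAT_REGS' and all cost values are hypothetical, and MODE is reduced to a plain 'int' because the example does not depend on it.

```c
/* Stand-in for the port's 'enum reg_class'.  Only GENERAL_REGS is named
   in the text above; FLOAT_REGS is a hypothetical extra class.  */
enum reg_class { NO_REGS, GENERAL_REGS, FLOAT_REGS, ALL_REGS };

/* Sketch of a TARGET_REGISTER_MOVE_COST implementation for an
   imaginary machine.  */
int
example_register_move_cost (int mode, enum reg_class from,
                            enum reg_class to)
{
  (void) mode;  /* Unused in this sketch.  */

  /* Moves within the general registers keep the default cost of 2.  */
  if (from == GENERAL_REGS && to == GENERAL_REGS)
    return 2;

  /* Assumed: moves within FLOAT_REGS are slower, showing that the
     cost need not be 2 when FROM equals TO.  */
  if (from == FLOAT_REGS && to == FLOAT_REGS)
    return 4;

  /* Assumed: crossing between register files is the most expensive.  */
  return 8;
}
```

In a real port the function would take 'machine_mode' and 'reg_class_t' arguments and be registered by defining 'TARGET_REGISTER_MOVE_COST' to its name.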
1 
1  -- Macro: MEMORY_MOVE_COST (MODE, CLASS, IN)
1      A C expression for the cost of moving data of mode MODE between a
1      register of class CLASS and memory; IN is zero if the value is to
1      be written to memory, nonzero if it is to be read in.  This cost is
1      relative to those in 'REGISTER_MOVE_COST'.  If moving between
1      registers and memory is more expensive than between two registers,
1      you should define this macro to express the relative cost.
1 
1      If you do not define this macro, GCC uses a default cost of 4 plus
1      the cost of copying via a secondary reload register, if one is
1      needed.  If your machine requires a secondary reload register to
1      copy between memory and a register of CLASS but the reload
1      mechanism is more complex than copying via an intermediate, define
1      this macro to reflect the actual cost of the move.
1 
1      GCC defines the function 'memory_move_secondary_cost' if secondary
1      reloads are needed.  It computes the costs due to copying via a
1      secondary register.  If your machine copies from memory using a
1      secondary register in the conventional way but the default base
1      value of 4 is not correct for your machine, define this macro to
1      add some other value to the result of that function.  The arguments
1      to that function are the same as to this macro.
1 
1      This macro is obsolete; new ports should use the target hook
1      'TARGET_MEMORY_MOVE_COST' instead.
1 
1  -- Target Hook: int TARGET_MEMORY_MOVE_COST (machine_mode MODE,
1           reg_class_t RCLASS, bool IN)
1      This target hook should return the cost of moving data of mode MODE
1      between a register of class RCLASS and memory; IN is 'false' if the
1      value is to be written to memory, 'true' if it is to be read in.
1      This cost is relative to those in 'TARGET_REGISTER_MOVE_COST'.  If
1      moving between registers and memory is more expensive than between
1      two registers, you should add this target hook to express the
1      relative cost.
1 
1      If you do not add this target hook, GCC uses a default cost of 4
1      plus the cost of copying via a secondary reload register, if one is
1      needed.  If your machine requires a secondary reload register to
1      copy between memory and a register of RCLASS but the reload
1      mechanism is more complex than copying via an intermediate, use
1      this target hook to reflect the actual cost of the move.
1 
1      GCC defines the function 'memory_move_secondary_cost' if secondary
1      reloads are needed.  It computes the costs due to copying via a
1      secondary register.  If your machine copies from memory using a
1      secondary register in the conventional way but the default base
1      value of 4 is not correct for your machine, use this target hook to
1      add some other value to the result of that function.  The arguments
1      to that function are the same as to this target hook.
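
As an illustration only, a hook for an imaginary machine on which stores are slower than loads might look like the sketch below.  The register classes are stand-ins, every cost value is an assumption, and secondary reloads are ignored entirely.

```c
#include <stdbool.h>

/* Stand-in for the port's 'enum reg_class'; FLOAT_REGS is hypothetical.  */
enum reg_class { NO_REGS, GENERAL_REGS, FLOAT_REGS, ALL_REGS };

/* Sketch of a TARGET_MEMORY_MOVE_COST implementation.  IN is true for
   a load from memory and false for a store to memory.  The values are
   chosen relative to a register move cost of 2.  */
int
example_memory_move_cost (int mode, enum reg_class rclass, bool in)
{
  (void) mode;  /* Unused in this sketch.  */

  /* Assumed: general registers load at the usual base cost of 4,
     while stores are somewhat slower.  */
  if (rclass == GENERAL_REGS)
    return in ? 4 : 6;

  /* Assumed: every other class pays more in both directions.  */
  return in ? 8 : 10;
}
```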
1 
1  -- Macro: BRANCH_COST (SPEED_P, PREDICTABLE_P)
1      A C expression for the cost of a branch instruction.  A value of 1
1      is the default; other values are interpreted relative to that.
1      Parameter SPEED_P is true when the branch in question should be
1      optimized for speed.  When it is false, 'BRANCH_COST' should return
1      a value optimal for code size rather than performance.
1      PREDICTABLE_P is true for well-predicted branches.  On many
1      architectures the 'BRANCH_COST' can be reduced then.
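
A definition along the following lines would express "branches are expensive only when optimizing for speed and the branch is poorly predicted".  The value 3 is a hypothetical tuning number, and the wrapper function exists only so that the macro can be exercised outside GCC.

```c
#include <stdbool.h>

/* Hypothetical tuning: keep the default cost of 1 when optimizing for
   size or when the branch is well predicted, otherwise use 3.  */
#define BRANCH_COST(speed_p, predictable_p) \
  ((speed_p) && !(predictable_p) ? 3 : 1)

/* Helper that only evaluates the macro; not part of GCC.  */
int
example_branch_cost (bool speed_p, bool predictable_p)
{
  return BRANCH_COST (speed_p, predictable_p);
}
```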
1 
1  Here are additional macros which do not specify precise relative costs,
1 but only that certain actions are more expensive than GCC would
1 ordinarily expect.
1 
1  -- Macro: SLOW_BYTE_ACCESS
1      Define this macro as a C expression which is nonzero if accessing
1      less than a word of memory (i.e. a 'char' or a 'short') is no
1      faster than accessing a word of memory, i.e., if such accesses
1      require more than one instruction or if there is no difference in
1      cost between byte and (aligned) word loads.
1 
1      When this macro is not defined, the compiler will access a field by
1      finding the smallest containing object; when it is defined, a
1      fullword load will be used if alignment permits.  Unless byte
1      accesses are faster than word accesses, using word accesses is
1      preferable since it may eliminate subsequent memory access if
1      subsequent accesses occur to other fields in the same word of the
1      structure, but to different bytes.
1 
1  -- Target Hook: bool TARGET_SLOW_UNALIGNED_ACCESS (machine_mode MODE,
1           unsigned int ALIGN)
1      This hook returns true if memory accesses described by the MODE and
1      ALIGN parameters have a cost many times greater than aligned
1      accesses, for example if they are emulated in a trap handler.  This
1      hook is invoked only for unaligned accesses, i.e. when 'ALIGN <
1      GET_MODE_ALIGNMENT (MODE)'.
1 
1      When this hook returns true, the compiler will act as if
1      'STRICT_ALIGNMENT' were true when generating code for block moves.
1      This can cause significantly more instructions to be produced.
1      Therefore, do not make this hook return true if unaligned accesses
1      only add a cycle or two to the time for a memory access.
1 
1      The hook must return true whenever 'STRICT_ALIGNMENT' is true.  The
1      default implementation returns 'STRICT_ALIGNMENT'.
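
The sketch below shows the shape such a hook could take.  It is self-contained rather than real GCC code: the mode is represented directly by its natural alignment in bits, 'EXAMPLE_STRICT_ALIGNMENT' stands in for 'STRICT_ALIGNMENT', and the rule that only accesses wider than 64 bits trap is an assumption about an imaginary machine.

```c
#include <stdbool.h>

/* Stand-in for the port's STRICT_ALIGNMENT setting.  */
#define EXAMPLE_STRICT_ALIGNMENT 0

/* Sketch of a TARGET_SLOW_UNALIGNED_ACCESS implementation.
   MODE_ALIGN plays the role of GET_MODE_ALIGNMENT (MODE) and ALIGN is
   the known alignment of the access, both in bits.  */
bool
example_slow_unaligned_access (unsigned int mode_align, unsigned int align)
{
  /* The hook must return true whenever STRICT_ALIGNMENT is true.  */
  if (EXAMPLE_STRICT_ALIGNMENT)
    return true;

  /* Assumed: misaligned accesses wider than 64 bits are emulated in a
     trap handler; narrower ones cost only a cycle or two extra and
     are therefore not reported as slow.  */
  return mode_align > 64 && align < mode_align;
}
```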
1 
1  -- Macro: MOVE_RATIO (SPEED)
1      The threshold for the number of scalar memory-to-memory move
1      insns, _below_ which a sequence of insns should be generated
1      instead of a string move insn or a library call.  Increasing the
1      value will always make code faster, but eventually incurs high cost
1      in increased code size.
1 
1      Note that on machines where the corresponding move insn is a
1      'define_expand' that emits a sequence of insns, this macro counts
1      the number of such sequences.
1 
1      The parameter SPEED is true if the code is currently being
1      optimized for speed rather than size.
1 
1      If you don't define this, a reasonable default is used.
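
To make the "below the threshold" rule concrete, here is a small sketch.  The values 15 and 3 are hypothetical, and 'example_expand_inline_p' is an illustrative helper, not the test GCC itself performs.

```c
#include <stdbool.h>

/* Hypothetical definition: allow longer inline sequences when
   optimizing for speed than when optimizing for size.  */
#define MOVE_RATIO(speed) ((speed) ? 15u : 3u)

/* Illustrative helper: a copy that needs N_MOVE_INSNS scalar moves is
   expanded inline only if that count is below the threshold.  */
bool
example_expand_inline_p (unsigned int n_move_insns, bool speed)
{
  return n_move_insns < MOVE_RATIO (speed);
}
```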
1 
1  -- Target Hook: bool TARGET_USE_BY_PIECES_INFRASTRUCTURE_P (unsigned
1           HOST_WIDE_INT SIZE, unsigned int ALIGNMENT, enum
1           by_pieces_operation OP, bool SPEED_P)
1      GCC will attempt several strategies when asked to copy between two
1      areas of memory, or to set, clear or store to memory, for example
1      when copying a 'struct'.  The 'by_pieces' infrastructure implements
1      such memory operations as a sequence of load, store or move insns.
1      Alternate strategies are to expand the 'movmem' or 'setmem' optabs,
1      to emit a library call, or to emit unit-by-unit, loop-based
1      operations.
1 
1      This target hook should return true if, for a memory operation with
1      a given SIZE and ALIGNMENT, using the 'by_pieces' infrastructure is
1      expected to result in better code generation.  Both SIZE and
1      ALIGNMENT are measured in terms of storage units.
1 
1      The parameter OP is one of: 'CLEAR_BY_PIECES', 'MOVE_BY_PIECES',
1      'SET_BY_PIECES', 'STORE_BY_PIECES' or 'COMPARE_BY_PIECES'.  These
1      describe the type of memory operation under consideration.
1 
1      The parameter SPEED_P is true if the code is currently being
1      optimized for speed rather than size.
1 
1      Returning true for higher values of SIZE can improve code
1      generation for speed if the target does not provide an
1      implementation of the 'movmem' or 'setmem' standard names, if the
1      'movmem' or 'setmem' implementation would be more expensive than a
1      sequence of insns, or if the overhead of a library call would
1      dominate that of the body of the memory operation.
1 
1      Returning true for higher values of SIZE may also cause an
1      increase in code size, for example where the number of insns
1      emitted to perform a move would be greater than that of a library
1      call.
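
One possible shape for this hook is sketched below.  The enumeration is redeclared so that the example stands alone, 'unsigned long long' replaces 'unsigned HOST_WIDE_INT', and every threshold is a made-up tuning value.

```c
#include <stdbool.h>

/* Redeclared here so the sketch is self-contained.  */
enum by_pieces_operation
{
  CLEAR_BY_PIECES,
  MOVE_BY_PIECES,
  SET_BY_PIECES,
  STORE_BY_PIECES,
  COMPARE_BY_PIECES
};

/* Sketch of a TARGET_USE_BY_PIECES_INFRASTRUCTURE_P implementation.
   SIZE and ALIGNMENT are in storage units.  */
bool
example_use_by_pieces_infrastructure_p (unsigned long long size,
                                        unsigned int alignment,
                                        enum by_pieces_operation op,
                                        bool speed_p)
{
  /* Assumed: inline up to 64 units for speed, 16 for size.  */
  unsigned long long limit = speed_p ? 64 : 16;

  /* Assumed: comparisons need a branch per piece, so halve the limit.  */
  if (op == COMPARE_BY_PIECES)
    limit /= 2;

  /* Assumed: poorly aligned blocks need more, narrower accesses.  */
  if (alignment < 4)
    limit /= 2;

  return size <= limit;
}
```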
1 
1  -- Target Hook: int TARGET_COMPARE_BY_PIECES_BRANCH_RATIO (machine_mode
1           MODE)
1      When expanding a block comparison in MODE, GCC can try to reduce
1      the number of branches at the expense of more memory operations.
1      This hook allows the target to override the default choice.  It
1      should return the factor by which branches should be reduced over
1      the plain expansion with one comparison per MODE-sized piece.  A
1      port can also prevent a particular mode from being used for block
1      comparisons by returning a negative number from this hook.
1 
1  -- Macro: MOVE_MAX_PIECES
1      A C expression used by 'move_by_pieces' to determine the size of
1      the largest unit that a single load or store may use when copying
1      memory.  Defaults to 'MOVE_MAX'.
1 
1  -- Macro: STORE_MAX_PIECES
1      A C expression used by 'store_by_pieces' to determine the size of
1      the largest unit that a single store to memory may use.  Defaults
1      to 'MOVE_MAX_PIECES', or two times the size of 'HOST_WIDE_INT',
1      whichever is smaller.
1 
1  -- Macro: COMPARE_MAX_PIECES
1      A C expression used by 'compare_by_pieces' to determine the size of
1      the largest unit that a single load may use when comparing memory.
1      Defaults to 'MOVE_MAX_PIECES'.
1 
1  -- Macro: CLEAR_RATIO (SPEED)
1      The threshold for the number of scalar move insns, _below_ which a
1      sequence of insns should be generated to clear memory instead of a
1      string clear insn or a library call.  Increasing the value will
1      always make code faster, but eventually incurs high cost in
1      increased code size.
1 
1      The parameter SPEED is true if the code is currently being
1      optimized for speed rather than size.
1 
1      If you don't define this, a reasonable default is used.
1 
1  -- Macro: SET_RATIO (SPEED)
1      The threshold for the number of scalar move insns, _below_ which a
1      sequence of insns should be generated to set memory to a constant
1      value, instead of a block set insn or a library call.  Increasing
1      the value will always make code faster, but eventually incurs high
1      cost in increased code size.
1 
1      The parameter SPEED is true if the code is currently being
1      optimized for speed rather than size.
1 
1      If you don't define this, it defaults to the value of 'MOVE_RATIO'.
1 
1  -- Macro: USE_LOAD_POST_INCREMENT (MODE)
1      A C expression used to determine whether a load postincrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_POST_INCREMENT'.
1 
1  -- Macro: USE_LOAD_POST_DECREMENT (MODE)
1      A C expression used to determine whether a load postdecrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_POST_DECREMENT'.
1 
1  -- Macro: USE_LOAD_PRE_INCREMENT (MODE)
1      A C expression used to determine whether a load preincrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_PRE_INCREMENT'.
1 
1  -- Macro: USE_LOAD_PRE_DECREMENT (MODE)
1      A C expression used to determine whether a load predecrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_PRE_DECREMENT'.
1 
1  -- Macro: USE_STORE_POST_INCREMENT (MODE)
1      A C expression used to determine whether a store postincrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_POST_INCREMENT'.
1 
1  -- Macro: USE_STORE_POST_DECREMENT (MODE)
1      A C expression used to determine whether a store postdecrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_POST_DECREMENT'.
1 
1  -- Macro: USE_STORE_PRE_INCREMENT (MODE)
1      This macro is used to determine whether a store preincrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_PRE_INCREMENT'.
1 
1  -- Macro: USE_STORE_PRE_DECREMENT (MODE)
1      This macro is used to determine whether a store predecrement is a
1      good thing to use for a given mode.  Defaults to the value of
1      'HAVE_PRE_DECREMENT'.
1 
1  -- Macro: NO_FUNCTION_CSE
1      Define this macro to be true if it is as good or better to call a
1      constant function address than to call an address kept in a
1      register.
1 
1  -- Macro: LOGICAL_OP_NON_SHORT_CIRCUIT
1      Define this macro if a non-short-circuit operation produced by
1      'fold_range_test ()' is optimal.  This macro defaults to true if
1      'BRANCH_COST' is greater than or equal to the value 2.
1 
1  -- Target Hook: bool TARGET_OPTAB_SUPPORTED_P (int OP, machine_mode
1           MODE1, machine_mode MODE2, optimization_type OPT_TYPE)
1      Return true if the optimizers should use optab OP with modes MODE1
1      and MODE2 for optimization type OPT_TYPE.  The optab is known to
1      have an associated '.md' instruction whose C condition is true.
1      MODE2 is only meaningful for conversion optabs; for direct optabs
1      it is a copy of MODE1.
1 
1      For example, when called with OP equal to 'rint_optab' and MODE1
1      equal to 'DFmode', the hook should say whether the optimizers
1      should use optab 'rintdf2'.
1 
1      The default hook returns true for all inputs.
1 
1  -- Target Hook: bool TARGET_RTX_COSTS (rtx X, machine_mode MODE, int
1           OUTER_CODE, int OPNO, int *TOTAL, bool SPEED)
1      This target hook describes the relative costs of RTL expressions.
1 
1      The cost may depend on the precise form of the expression, which is
1      available for examination in X, and the fact that X appears as
1      operand OPNO of an expression with rtx code OUTER_CODE.  That is,
1      the hook can assume that there is some rtx Y such that 'GET_CODE
1      (Y) == OUTER_CODE' and such that either (a) 'XEXP (Y, OPNO) == X'
1      or (b) 'XVEC (Y, OPNO)' contains X.
1 
1      MODE is X's machine mode, or for cases like 'const_int' that do not
1      have a mode, the mode in which X is used.
1 
1      In implementing this hook, you can use the construct 'COSTS_N_INSNS
1      (N)' to specify a cost equal to N fast instructions.
1 
1      On entry to the hook, '*TOTAL' contains a default estimate for the
1      cost of the expression.  The hook should modify this value as
1      necessary.  Traditionally, the default costs are 'COSTS_N_INSNS
1      (5)' for multiplications, 'COSTS_N_INSNS (7)' for division and
1      modulus operations, and 'COSTS_N_INSNS (1)' for all other
1      operations.
1 
1      When optimizing for code size, i.e. when SPEED is false, this
1      target hook should be used to estimate the relative size cost of an
1      expression, again relative to 'COSTS_N_INSNS'.
1 
1      The hook returns true when all subexpressions of X have been
1      processed, and false when 'rtx_cost' should recurse.
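
The following sketch illustrates the protocol described above: adjust '*TOTAL', then return true if nothing further needs costing or false to let 'rtx_cost' recurse.  It does not use GCC's real 'rtx' type; 'struct example_rtx', the reduced 'rtx_code' enumeration, the immediate range and all cost values are assumptions made for illustration, and 'COSTS_N_INSNS' is given a stand-in definition.

```c
#include <stdbool.h>

/* Stand-in for GCC's macro: scale an instruction count to cost units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Reduced stand-ins for GCC's rtx codes and rtx objects.  */
enum rtx_code { PLUS, MULT, DIV, MOD, CONST_INT };
struct example_rtx { enum rtx_code code; long value; };

/* Sketch of a TARGET_RTX_COSTS implementation.  */
bool
example_rtx_costs (const struct example_rtx *x, int mode, int outer_code,
                   int opno, int *total, bool speed)
{
  (void) mode;
  (void) opno;

  switch (x->code)
    {
    case CONST_INT:
      /* Assumed: small immediates are free as an operand of PLUS.  */
      if (outer_code == PLUS && x->value >= -128 && x->value <= 127)
        *total = 0;
      else
        *total = COSTS_N_INSNS (1);
      return true;   /* A constant has no subexpressions.  */

    case MULT:
      /* Assumed: a multiply takes three cycles but is a single insn.  */
      *total = speed ? COSTS_N_INSNS (3) : COSTS_N_INSNS (1);
      return false;  /* Let 'rtx_cost' add the operand costs.  */

    default:
      return false;  /* Keep the default estimate already in *TOTAL.  */
    }
}
```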
1 
1  -- Target Hook: int TARGET_ADDRESS_COST (rtx ADDRESS, machine_mode
1           MODE, addr_space_t AS, bool SPEED)
1      This hook computes the cost of an addressing mode that contains
1      ADDRESS.  If not defined, the cost is computed from the ADDRESS
1      expression and the 'TARGET_RTX_COSTS' hook.
1 
1      For most CISC machines, the default cost is a good approximation of
1      the true cost of the addressing mode.  However, on RISC machines,
1      all instructions normally have the same length and execution time.
1      Hence all addresses will have equal costs.
1 
1      In cases where more than one form of an address is known, the form
1      with the lowest cost will be used.  If multiple forms have the
1      same, lowest, cost, the one that is the most complex will be used.
1 
1      For example, suppose an address that is equal to the sum of a
1      register and a constant is used twice in the same basic block.
1      When this hook is not defined, the address will be computed in a
1      register and memory references will be indirect through that
1      register.  On machines where the cost of the addressing mode
1      containing the sum is no higher than that of a simple indirect
1      reference, this will produce an additional instruction and possibly
1      require an additional register.  Proper specification of this hook
1      eliminates this overhead for such machines.
1 
1      This hook is never called with an invalid address.
1 
1      On machines where an address involving more than one register is as
1      cheap as an address computation involving only one register,
1      defining 'TARGET_ADDRESS_COST' to reflect this can cause two
1      registers to be live over a region of code where only one would
1      have been if 'TARGET_ADDRESS_COST' were not defined in that manner.
1      This effect should be considered in the definition of this hook.
1      Equivalent costs should probably only be given to addresses with
1      different numbers of registers on machines with lots of registers.
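
The trade-off can be sketched as follows.  Real implementations inspect the ADDRESS rtx; here the address is pre-classified by a hypothetical 'enum example_addr_form', and the cost values are assumptions for an imaginary machine with few registers.

```c
#include <stdbool.h>

/* Hypothetical classification of an address; a real hook would derive
   this from the ADDRESS rtx.  */
enum example_addr_form
{
  ADDR_REG,             /* (reg) */
  ADDR_REG_PLUS_CONST,  /* (plus (reg) (const_int)) */
  ADDR_REG_PLUS_REG     /* (plus (reg) (reg)) */
};

/* Sketch of a TARGET_ADDRESS_COST implementation.  */
int
example_address_cost (enum example_addr_form form, bool speed)
{
  (void) speed;  /* Assumed: same ranking for speed and size.  */

  switch (form)
    {
    case ADDR_REG:
    case ADDR_REG_PLUS_CONST:
      /* Register plus constant is no dearer than simple indirection,
         so the sum need not be computed into a separate register.  */
      return 1;

    case ADDR_REG_PLUS_REG:
      /* Assumed: registers are scarce, so a two-register address is
         ranked higher to avoid keeping both registers live.  */
      return 2;
    }
  return 1;
}
```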
1 
1  -- Target Hook: int TARGET_INSN_COST (rtx_insn *INSN, bool SPEED)
1      This target hook describes the relative costs of RTL instructions.
1 
1      In implementing this hook, you can use the construct 'COSTS_N_INSNS
1      (N)' to specify a cost equal to N fast instructions.
1 
1      When optimizing for code size, i.e. when SPEED is false, this
1      target hook should be used to estimate the relative size cost of an
1      instruction, again relative to 'COSTS_N_INSNS'.
1 
1  -- Target Hook: unsigned int TARGET_MAX_NOCE_IFCVT_SEQ_COST (edge E)
1      This hook returns a value in the same units as 'TARGET_RTX_COSTS',
1      giving the maximum acceptable cost for a sequence generated by the
1      RTL if-conversion pass when conditional execution is not available.
1      The RTL if-conversion pass attempts to convert conditional
1      operations that would require a branch to a series of unconditional
1      operations and 'movMODEcc' insns.  This hook returns the maximum
1      cost of the unconditional instructions and the 'movMODEcc' insns.
1      RTL if-conversion is cancelled if the cost of the converted
1      sequence is greater than the value returned by this hook.
1 
1      E is the edge from the basic block containing the conditional
1      branch to the basic block which would be executed if the condition
1      were true.
1 
1      The default implementation of this hook uses the
1      'max-rtl-if-conversion-predictable-cost' or
1      'max-rtl-if-conversion-unpredictable-cost' parameter if it is set,
1      and uses a multiple of 'BRANCH_COST' otherwise.
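
The "multiple of 'BRANCH_COST'" fallback and the acceptance rule can be sketched as below.  This is not GCC's default implementation: the edge argument is replaced by a flag saying whether the branch is predictable, the multiplier is hypothetical, and 'BRANCH_COST' and 'COSTS_N_INSNS' are given stand-in definitions.

```c
#include <stdbool.h>

/* Stand-in definitions so that the sketch is self-contained.  */
#define COSTS_N_INSNS(N) ((N) * 4)
#define BRANCH_COST(speed_p, predictable_p) \
  ((speed_p) && !(predictable_p) ? 3 : 1)

/* Sketch of a TARGET_MAX_NOCE_IFCVT_SEQ_COST implementation.
   PREDICTABLE_P stands in for a property computed from edge E.  */
unsigned int
example_max_noce_ifcvt_seq_cost (bool predictable_p)
{
  /* Assumed multiplier: allow two insns per unit of branch cost.  */
  return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (2);
}

/* If-conversion is cancelled when the converted sequence costs more
   than the value returned by the hook.  */
bool
example_ifcvt_allowed_p (unsigned int seq_cost, bool predictable_p)
{
  return seq_cost <= example_max_noce_ifcvt_seq_cost (predictable_p);
}
```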
1 
1  -- Target Hook: bool TARGET_NOCE_CONVERSION_PROFITABLE_P (rtx_insn
1           *SEQ, struct noce_if_info *IF_INFO)
1      This hook returns true if the instruction sequence SEQ is a good
1      candidate as a replacement for the if-convertible sequence
1      described in IF_INFO.
1 
1  -- Target Hook: bool TARGET_NO_SPECULATION_IN_DELAY_SLOTS_P (void)
1      This predicate controls the use of the eager delay slot filler to
1      disallow speculatively executed instructions being placed in delay
1      slots.  Targets such as certain MIPS architectures possess
1      branches both with and without delay slots.  As the eager delay slot
1      filler can decrease performance, disabling it is beneficial when
1      ordinary branches are available.  Use of delay slot branches filled
1      using the basic filler is often still desirable as the delay slot
1      can hide a pipeline bubble.
1 
1  -- Target Hook: HOST_WIDE_INT TARGET_ESTIMATED_POLY_VALUE (poly_int64
1           VAL)
1      Return an estimate of the runtime value of VAL, for use in things
1      like cost calculations or profiling frequencies.  The default
1      implementation returns the lowest possible value of VAL.
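
A rough sketch of the idea, under simplifying assumptions: GCC's 'poly_int64' is a C++ class, so the struct below is only a stand-in that models a value as 'c0 + c1 * x', where x is a nonnegative quantity known only at run time and c1 is assumed nonnegative.  The "typical" value of x is a hypothetical tuning choice.

```c
/* Simplified stand-in for poly_int64: the value is
   coeffs[0] + coeffs[1] * x for some runtime x >= 0.  */
struct example_poly_int64 { long long coeffs[2]; };

/* Lowest possible value, assuming coeffs[1] >= 0: take x = 0.  */
long long
example_lowest_poly_value (struct example_poly_int64 val)
{
  return val.coeffs[0];
}

/* Sketch of a target override that assumes a typical runtime x.  */
long long
example_estimated_poly_value (struct example_poly_int64 val)
{
  const long long typical_x = 1;  /* Hypothetical tuning value.  */
  return val.coeffs[0] + val.coeffs[1] * typical_x;
}
```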
1