gcc: Optimize Options

1 
1 3.10 Options That Control Optimization
1 ======================================
1 
1 These options control various sorts of optimizations.
1 
1  Without any optimization option, the compiler's goal is to reduce the
1 cost of compilation and to make debugging produce the expected results.
1 Statements are independent: if you stop the program with a breakpoint
1 between statements, you can then assign a new value to any variable or
1 change the program counter to any other statement in the function and
1 get exactly the results you expect from the source code.
1 
1  Turning on optimization flags makes the compiler attempt to improve the
1 performance and/or code size at the expense of compilation time and
1 possibly the ability to debug the program.
1 
 The compiler performs optimization based on the knowledge it has of the
program.  Compiling multiple files at once into a single output file
allows the compiler to use information gained from all of the files when
compiling each of them.
1 
1  Not all optimizations are controlled directly by a flag.  Only
1 optimizations that have a flag are listed in this section.
1 
1  Most optimizations are only enabled if an '-O' level is set on the
1 command line.  Otherwise they are disabled, even if individual
1 optimization flags are specified.
1 
1  Depending on the target and how GCC was configured, a slightly
1 different set of optimizations may be enabled at each '-O' level than
1 those listed here.  You can invoke GCC with '-Q --help=optimizers' to
1 find out the exact set of optimizations that are enabled at each level.
See Overall Options, for examples.
1 
1 '-O'
1 '-O1'
1      Optimize.  Optimizing compilation takes somewhat more time, and a
1      lot more memory for a large function.
1 
1      With '-O', the compiler tries to reduce code size and execution
1      time, without performing any optimizations that take a great deal
1      of compilation time.
1 
1      '-O' turns on the following optimization flags:
1           -fauto-inc-dec
1           -fbranch-count-reg
1           -fcombine-stack-adjustments
1           -fcompare-elim
1           -fcprop-registers
1           -fdce
1           -fdefer-pop
1           -fdelayed-branch
1           -fdse
1           -fforward-propagate
1           -fguess-branch-probability
1           -fif-conversion2
1           -fif-conversion
1           -finline-functions-called-once
1           -fipa-pure-const
1           -fipa-profile
1           -fipa-reference
1           -fmerge-constants
1           -fmove-loop-invariants
1           -fomit-frame-pointer
1           -freorder-blocks
1           -fshrink-wrap
1           -fshrink-wrap-separate
1           -fsplit-wide-types
1           -fssa-backprop
1           -fssa-phiopt
1           -ftree-bit-ccp
1           -ftree-ccp
1           -ftree-ch
1           -ftree-coalesce-vars
1           -ftree-copy-prop
1           -ftree-dce
1           -ftree-dominator-opts
1           -ftree-dse
1           -ftree-forwprop
1           -ftree-fre
1           -ftree-phiprop
1           -ftree-sink
1           -ftree-slsr
1           -ftree-sra
1           -ftree-pta
1           -ftree-ter
1           -funit-at-a-time
1 
1 '-O2'
1      Optimize even more.  GCC performs nearly all supported
1      optimizations that do not involve a space-speed tradeoff.  As
1      compared to '-O', this option increases both compilation time and
1      the performance of the generated code.
1 
1      '-O2' turns on all optimization flags specified by '-O'.  It also
1      turns on the following optimization flags:
1           -fthread-jumps
1           -falign-functions  -falign-jumps
1           -falign-loops  -falign-labels
1           -fcaller-saves
1           -fcrossjumping
1           -fcse-follow-jumps  -fcse-skip-blocks
1           -fdelete-null-pointer-checks
1           -fdevirtualize -fdevirtualize-speculatively
1           -fexpensive-optimizations
1           -fgcse  -fgcse-lm
1           -fhoist-adjacent-loads
1           -finline-small-functions
1           -findirect-inlining
1           -fipa-cp
1           -fipa-bit-cp
1           -fipa-vrp
1           -fipa-sra
1           -fipa-icf
1           -fisolate-erroneous-paths-dereference
1           -flra-remat
1           -foptimize-sibling-calls
1           -foptimize-strlen
1           -fpartial-inlining
1           -fpeephole2
1           -freorder-blocks-algorithm=stc
1           -freorder-blocks-and-partition -freorder-functions
1           -frerun-cse-after-loop
1           -fsched-interblock  -fsched-spec
1           -fschedule-insns  -fschedule-insns2
1           -fstore-merging
1           -fstrict-aliasing
1           -ftree-builtin-call-dce
1           -ftree-switch-conversion -ftree-tail-merge
1           -fcode-hoisting
1           -ftree-pre
1           -ftree-vrp
1           -fipa-ra
1 
1      Please note the warning under '-fgcse' about invoking '-O2' on
1      programs that use computed gotos.
1 
1 '-O3'
1      Optimize yet more.  '-O3' turns on all optimizations specified by
1      '-O2' and also turns on the following optimization flags:
1           -finline-functions
1           -funswitch-loops
1           -fpredictive-commoning
1           -fgcse-after-reload
1           -ftree-loop-vectorize
1           -ftree-loop-distribution
1           -ftree-loop-distribute-patterns
1           -floop-interchange
1           -floop-unroll-and-jam
1           -fsplit-paths
1           -ftree-slp-vectorize
1           -fvect-cost-model
1           -ftree-partial-pre
1           -fpeel-loops
1           -fipa-cp-clone
1 
1 '-O0'
1      Reduce compilation time and make debugging produce the expected
1      results.  This is the default.
1 
1 '-Os'
1      Optimize for size.  '-Os' enables all '-O2' optimizations that do
1      not typically increase code size.
1 
1      '-Os' disables the following optimization flags:
1           -falign-functions  -falign-jumps  -falign-loops
1           -falign-labels  -fprefetch-loop-arrays
1 
1      It also enables '-finline-functions', causes the compiler to tune
1      for code size rather than execution speed, and performs further
1      optimizations designed to reduce code size.
1 
1 '-Ofast'
1      Disregard strict standards compliance.  '-Ofast' enables all '-O3'
1      optimizations.  It also enables optimizations that are not valid
1      for all standard-compliant programs.  It turns on '-ffast-math' and
1      the Fortran-specific '-fstack-arrays', unless
1      '-fmax-stack-var-size' is specified, and '-fno-protect-parens'.
1 
1 '-Og'
1      Optimize debugging experience.  '-Og' enables optimizations that do
1      not interfere with debugging.  It should be the optimization level
1      of choice for the standard edit-compile-debug cycle, offering a
1      reasonable level of optimization while maintaining fast compilation
1      and a good debugging experience.
1 
1  If you use multiple '-O' options, with or without level numbers, the
1 last such option is the one that is effective.
1 
1  Options of the form '-fFLAG' specify machine-independent flags.  Most
1 flags have both positive and negative forms; the negative form of
1 '-ffoo' is '-fno-foo'.  In the table below, only one of the forms is
1 listed--the one you typically use.  You can figure out the other form by
1 either removing 'no-' or adding it.
1 
1  The following options control specific optimizations.  They are either
1 activated by '-O' options or are related to ones that are.  You can use
1 the following flags in the rare cases when "fine-tuning" of
1 optimizations to be performed is desired.
1 
1 '-fno-defer-pop'
1      Always pop the arguments to each function call as soon as that
1      function returns.  For machines that must pop arguments after a
1      function call, the compiler normally lets arguments accumulate on
1      the stack for several function calls and pops them all at once.
1 
1      Disabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-fforward-propagate'
1      Perform a forward propagation pass on RTL.  The pass tries to
1      combine two instructions and checks if the result can be
1      simplified.  If loop unrolling is active, two passes are performed
1      and the second is scheduled after loop unrolling.
1 
1      This option is enabled by default at optimization levels '-O',
1      '-O2', '-O3', '-Os'.
1 
1 '-ffp-contract=STYLE'
1      '-ffp-contract=off' disables floating-point expression contraction.
1      '-ffp-contract=fast' enables floating-point expression contraction
1      such as forming of fused multiply-add operations if the target has
1      native support for them.  '-ffp-contract=on' enables floating-point
1      expression contraction if allowed by the language standard.  This
1      is currently not implemented and treated equal to
1      '-ffp-contract=off'.
1 
1      The default is '-ffp-contract=fast'.
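
     As a minimal sketch of the kind of expression that contraction
     applies to, consider a multiply feeding an add.  The function name
     'mul_add' is hypothetical, and whether a fused instruction is
     actually emitted depends on the target and the options in effect.

```c
/* Hypothetical example: a multiply followed by an add.
   Under -ffp-contract=fast, on a target with a native fused
   multiply-add instruction, the compiler may contract this into a
   single operation that rounds once.  Under -ffp-contract=off the
   multiply and the add stay separate and each is rounded. */
double mul_add(double a, double b, double c)
{
    return a * b + c;
}
```

     For inputs whose product and sum are exactly representable, both
     forms give the same result; they can differ in the last bit
     otherwise.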
1 
1 '-fomit-frame-pointer'
1      Omit the frame pointer in functions that don't need one.  This
1      avoids the instructions to save, set up and restore the frame
1      pointer; on many targets it also makes an extra register available.
1 
1      On some targets this flag has no effect because the standard
1      calling sequence always uses a frame pointer, so it cannot be
1      omitted.
1 
1      Note that '-fno-omit-frame-pointer' doesn't guarantee the frame
1      pointer is used in all functions.  Several targets always omit the
1      frame pointer in leaf functions.
1 
1      Enabled by default at '-O' and higher.
1 
1 '-foptimize-sibling-calls'
1      Optimize sibling and tail recursive calls.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
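
     A small sketch of a tail-recursive call that this option targets.
     The name 'sum_to' and its accumulator parameter are assumptions
     made for illustration only.

```c
/* Tail-recursive sum of 1..n.  The recursive call is the last
   action of the function, so nothing in the caller's frame is
   needed afterwards and the call can be turned into a jump. */
unsigned long sum_to(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n);   /* call in tail position */
}
```

     Calling 'sum_to(10, 0)' returns 55.  Without the optimization the
     function is still correct, but each call uses a new stack frame.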
1 
1 '-foptimize-strlen'
1      Optimize various standard C string functions (e.g.  'strlen',
1      'strchr' or 'strcpy') and their '_FORTIFY_SOURCE' counterparts into
1      faster alternatives.
1 
1      Enabled at levels '-O2', '-O3'.
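
     The sketch below shows one pattern where the length of a string is
     already known from an earlier call.  'copy_and_measure' is a
     hypothetical helper; whether the second scan is removed depends on
     what the compiler can prove.

```c
#include <string.h>

/* Hypothetical helper: copy src into buf and return its length.
   Right after the strcpy, the length of buf equals the length of
   src, so the compiler may be able to reuse that knowledge rather
   than scan the string again. */
size_t copy_and_measure(char *buf, const char *src)
{
    strcpy(buf, src);      /* caller guarantees buf is large enough */
    return strlen(buf);
}
```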
1 
1 '-fno-inline'
1      Do not expand any functions inline apart from those marked with the
1      'always_inline' attribute.  This is the default when not
1      optimizing.
1 
1      Single functions can be exempted from inlining by marking them with
1      the 'noinline' attribute.
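
     A brief sketch of the two attributes mentioned above.  The function
     names 'square' and 'cube' are made up for the example.

```c
/* 'always_inline' asks the compiler to inline this function even
   when inlining is otherwise off, for example with -fno-inline or
   when not optimizing. */
static inline __attribute__((always_inline)) int square(int x)
{
    return x * x;
}

/* 'noinline' exempts this one function from inlining at any
   optimization level. */
__attribute__((noinline)) int cube(int x)
{
    return x * square(x);
}
```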
1 
1 '-finline-small-functions'
1      Integrate functions into their callers when their body is smaller
1      than expected function call code (so overall size of program gets
1      smaller).  The compiler heuristically decides which functions are
1      simple enough to be worth integrating in this way.  This inlining
1      applies to all functions, even those not declared inline.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-findirect-inlining'
     Also inline indirect calls that are discovered to be known at
     compile time thanks to previous inlining.  This option has an
     effect only when inlining itself is turned on by the
     '-finline-functions' or '-finline-small-functions' options.

     Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-finline-functions'
1      Consider all functions for inlining, even if they are not declared
1      inline.  The compiler heuristically decides which functions are
1      worth integrating in this way.
1 
1      If all calls to a given function are integrated, and the function
1      is declared 'static', then the function is normally not output as
1      assembler code in its own right.
1 
     Enabled at levels '-O3', '-Os'.  Also enabled by '-fprofile-use'
     and '-fauto-profile'.
1 
1 '-finline-functions-called-once'
1      Consider all 'static' functions called once for inlining into their
1      caller even if they are not marked 'inline'.  If a call to a given
1      function is integrated, then the function is not output as
1      assembler code in its own right.
1 
1      Enabled at levels '-O1', '-O2', '-O3' and '-Os'.
1 
1 '-fearly-inlining'
     Inline functions marked 'always_inline' and functions whose body
     seems smaller than the function call overhead early, before the
     '-fprofile-generate' instrumentation and the real inlining pass.
     Doing so makes profiling significantly cheaper and usually makes
     inlining faster on programs having large chains of nested wrapper
     functions.
1 
1      Enabled by default.
1 
1 '-fipa-sra'
1      Perform interprocedural scalar replacement of aggregates, removal
1      of unused parameters and replacement of parameters passed by
1      reference by parameters passed by value.
1 
1      Enabled at levels '-O2', '-O3' and '-Os'.
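
     As an illustration, the sketch below has a 'static' function that
     receives a structure by reference but reads only one field.  The
     type and function names are hypothetical, and the transformation
     described in the comment is what the pass may do, not a guarantee.

```c
struct point { int x; int y; int z; };

/* Only p->x is read, and the function is static, so every caller is
   visible to the compiler.  Interprocedural SRA may rewrite it to
   take a single int by value instead of a pointer to the struct. */
static int get_x(const struct point *p)
{
    return p->x;
}

int first_coordinate(int x, int y, int z)
{
    struct point p = { x, y, z };
    return get_x(&p);
}
```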
1 
1 '-finline-limit=N'
1      By default, GCC limits the size of functions that can be inlined.
1      This flag allows coarse control of this limit.  N is the size of
1      functions that can be inlined in number of pseudo instructions.
1 
1      Inlining is actually controlled by a number of parameters, which
1      may be specified individually by using '--param NAME=VALUE'.  The
1      '-finline-limit=N' option sets some of these parameters as follows:
1 
1      'max-inline-insns-single'
1           is set to N/2.
1      'max-inline-insns-auto'
1           is set to N/2.
1 
     See below for documentation of the individual parameters
     controlling inlining and for the defaults of these parameters.
1 
1      _Note:_ there may be no value to '-finline-limit' that results in
1      default behavior.
1 
     _Note:_ a pseudo instruction represents, in this particular
     context, an abstract measurement of a function's size.  In no way
     does it represent a count of assembly instructions, and as such its
     exact meaning might change from one release to another.
1 
1 '-fno-keep-inline-dllexport'
1      This is a more fine-grained version of '-fkeep-inline-functions',
1      which applies only to functions that are declared using the
     'dllexport' attribute or declspec.  See Declaring Attributes of
     Functions (Function Attributes).
1 
1 '-fkeep-inline-functions'
1      In C, emit 'static' functions that are declared 'inline' into the
1      object file, even if the function has been inlined into all of its
1      callers.  This switch does not affect functions using the 'extern
1      inline' extension in GNU C90.  In C++, emit any and all inline
1      functions into the object file.
1 
1 '-fkeep-static-functions'
1      Emit 'static' functions into the object file, even if the function
1      is never used.
1 
1 '-fkeep-static-consts'
1      Emit variables declared 'static const' when optimization isn't
1      turned on, even if the variables aren't referenced.
1 
1      GCC enables this option by default.  If you want to force the
1      compiler to check if a variable is referenced, regardless of
1      whether or not optimization is turned on, use the
1      '-fno-keep-static-consts' option.
1 
1 '-fmerge-constants'
1      Attempt to merge identical constants (string constants and
1      floating-point constants) across compilation units.
1 
1      This option is the default for optimized compilation if the
1      assembler and linker support it.  Use '-fno-merge-constants' to
1      inhibit this behavior.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-fmerge-all-constants'
1      Attempt to merge identical constants and identical variables.
1 
1      This option implies '-fmerge-constants'.  In addition to
1      '-fmerge-constants' this considers e.g. even constant initialized
1      arrays or initialized constant variables with integral or
1      floating-point types.  Languages like C or C++ require each
1      variable, including multiple instances of the same variable in
1      recursive calls, to have distinct locations, so using this option
1      results in non-conforming behavior.
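
     The following sketch shows why merging variables is non-conforming.
     The array and function names are invented for the example.

```c
#include <string.h>

/* Two distinct objects with identical contents. */
const int table_a[4] = { 1, 2, 3, 4 };
const int table_b[4] = { 1, 2, 3, 4 };

/* Returns 1 whether or not the arrays are merged. */
int same_contents(void)
{
    return memcmp(table_a, table_b, sizeof table_a) == 0;
}

/* In conforming C this returns 0, because table_a and table_b are
   distinct objects.  With -fmerge-all-constants the two arrays may
   share storage, in which case this may return 1. */
int same_address(void)
{
    return (const void *)table_a == (const void *)table_b;
}
```

     Code that uses object addresses as identities is the kind of code
     that can break under this option.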
1 
1 '-fmodulo-sched'
1      Perform swing modulo scheduling immediately before the first
1      scheduling pass.  This pass looks at innermost loops and reorders
1      their instructions by overlapping different iterations.
1 
1 '-fmodulo-sched-allow-regmoves'
     Perform more aggressive SMS-based modulo scheduling with register
     moves allowed.  Setting this flag deletes certain anti-dependence
     edges, which triggers the generation of reg-moves based on the
     life-range analysis.  This option is effective only with
     '-fmodulo-sched' enabled.
1 
1 '-fno-branch-count-reg'
1      Avoid running a pass scanning for opportunities to use "decrement
1      and branch" instructions on a count register instead of generating
1      sequences of instructions that decrement a register, compare it
1      against zero, and then branch based upon the result.  This option
1      is only meaningful on architectures that support such instructions,
1      which include x86, PowerPC, IA-64 and S/390.  Note that the
1      '-fno-branch-count-reg' option doesn't remove the decrement and
1      branch instructions from the generated instruction stream
1      introduced by other optimization passes.
1 
     The positive form, '-fbranch-count-reg', is enabled by default at
     '-O1' and higher.
1 
1 '-fno-function-cse'
1      Do not put function addresses in registers; make each instruction
1      that calls a constant function contain the function's address
1      explicitly.
1 
1      This option results in less efficient code, but some strange hacks
1      that alter the assembler output may be confused by the
1      optimizations performed when this option is not used.
1 
     The default is '-ffunction-cse'.
1 
1 '-fno-zero-initialized-in-bss'
1      If the target supports a BSS section, GCC by default puts variables
1      that are initialized to zero into BSS.  This can save space in the
1      resulting code.
1 
1      This option turns off this behavior because some programs
1      explicitly rely on variables going to the data section--e.g., so
1      that the resulting executable can find the beginning of that
1      section and/or make assumptions based on that.
1 
1      The default is '-fzero-initialized-in-bss'.
1 
1 '-fthread-jumps'
1      Perform optimizations that check to see if a jump branches to a
1      location where another comparison subsumed by the first is found.
1      If so, the first branch is redirected to either the destination of
1      the second branch or a point immediately following it, depending on
1      whether the condition is known to be true or false.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
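
     A hypothetical example of a comparison subsumed by an earlier one.
     The function name 'classify' and the constants are assumptions
     chosen only to make the control flow visible.

```c
/* When the first test (x > 10) is true, the second test (x > 5) is
   known to be true as well.  The jump leaving the first 'if' can
   then go straight to the body of the second 'if' instead of
   re-evaluating the comparison. */
int classify(int x)
{
    int score = 0;
    if (x > 10)
        score += 2;
    if (x > 5)          /* subsumed by x > 10 on that path */
        score += 1;
    return score;
}
```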
1 
1 '-fsplit-wide-types'
1      When using a type that occupies multiple registers, such as 'long
1      long' on a 32-bit system, split the registers apart and allocate
1      them independently.  This normally generates better code for those
1      types, but may make debugging more difficult.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-fcse-follow-jumps'
1      In common subexpression elimination (CSE), scan through jump
1      instructions when the target of the jump is not reached by any
1      other path.  For example, when CSE encounters an 'if' statement
1      with an 'else' clause, CSE follows the jump when the condition
1      tested is false.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fcse-skip-blocks'
1      This is similar to '-fcse-follow-jumps', but causes CSE to follow
1      jumps that conditionally skip over blocks.  When CSE encounters a
1      simple 'if' statement with no else clause, '-fcse-skip-blocks'
1      causes CSE to follow the jump around the body of the 'if'.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
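
     As a rough sketch, the function below repeats a subexpression on
     both sides of a simple 'if' with no 'else'.  The name 'scaled' is
     hypothetical; which pass actually removes the redundancy in a given
     GCC release is not something this example can promise.

```c
/* a * b is computed before the 'if' and again inside its body.
   Recognizing the second product as a repeat of the first requires
   looking across the conditional jump around the body. */
int scaled(int a, int b, int flag)
{
    int r = a * b;
    if (flag)
        r += a * b;     /* same value as the a * b above */
    return r;
}
```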
1 
1 '-frerun-cse-after-loop'
1      Re-run common subexpression elimination after loop optimizations
1      are performed.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fgcse'
1      Perform a global common subexpression elimination pass.  This pass
1      also performs global constant and copy propagation.
1 
1      _Note:_ When compiling a program using computed gotos, a GCC
1      extension, you may get better run-time performance if you disable
1      the global common subexpression elimination pass by adding
1      '-fno-gcse' to the command line.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
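
     For readers unfamiliar with computed gotos, here is a minimal
     sketch of the construct the note refers to.  The opcode encoding
     and the name 'run' are invented for the example; it relies on the
     GNU C "labels as values" extension and is not portable ISO C.

```c
/* Tiny interpreter using computed gotos.
   Opcodes: 0 = halt, 1 = increment, 2 = double.
   The caller must pass only these opcodes and end with 0. */
int run(const unsigned char *code)
{
    static void *const dispatch[] = { &&op_halt, &&op_inc, &&op_dbl };
    int acc = 0;

    goto *dispatch[*code++];

op_inc:
    acc += 1;
    goto *dispatch[*code++];
op_dbl:
    acc *= 2;
    goto *dispatch[*code++];
op_halt:
    return acc;
}
```

     Programs shaped like this are the ones for which trying
     '-fno-gcse' may be worthwhile.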
1 
1 '-fgcse-lm'
1      When '-fgcse-lm' is enabled, global common subexpression
1      elimination attempts to move loads that are only killed by stores
1      into themselves.  This allows a loop containing a load/store
1      sequence to be changed to a load outside the loop, and a copy/store
1      within the loop.
1 
1      Enabled by default when '-fgcse' is enabled.
1 
1 '-fgcse-sm'
1      When '-fgcse-sm' is enabled, a store motion pass is run after
1      global common subexpression elimination.  This pass attempts to
1      move stores out of loops.  When used in conjunction with
1      '-fgcse-lm', loops containing a load/store sequence can be changed
1      to a load before the loop and a store after the loop.
1 
1      Not enabled at any optimization level.
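
     The combined effect of '-fgcse-lm' and '-fgcse-sm' can be pictured
     with a hand-written before and after.  Both function names are
     hypothetical, and the second function is only a sketch of the
     intended result, not the compiler's actual output.

```c
/* As written: *total is loaded and stored on every iteration. */
void add_all(int *total, const int *v, int n)
{
    for (int i = 0; i < n; i++)
        *total += v[i];
}

/* Hand-written picture of the goal: one load before the loop and
   one store after it.  The two functions are equivalent only when
   total does not alias any element of v. */
void add_all_hoisted(int *total, const int *v, int n)
{
    int t = *total;             /* load moved before the loop */
    for (int i = 0; i < n; i++)
        t += v[i];
    *total = t;                 /* store moved after the loop */
}
```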
1 
1 '-fgcse-las'
1      When '-fgcse-las' is enabled, the global common subexpression
1      elimination pass eliminates redundant loads that come after stores
1      to the same memory location (both partial and full redundancies).
1 
1      Not enabled at any optimization level.
1 
1 '-fgcse-after-reload'
1      When '-fgcse-after-reload' is enabled, a redundant load elimination
1      pass is performed after reload.  The purpose of this pass is to
1      clean up redundant spilling.
1 
1 '-faggressive-loop-optimizations'
1      This option tells the loop optimizer to use language constraints to
1      derive bounds for the number of iterations of a loop.  This assumes
1      that loop code does not invoke undefined behavior by for example
1      causing signed integer overflows or out-of-bound array accesses.
1      The bounds for the number of iterations of a loop are used to guide
1      loop unrolling and peeling and loop exit test optimizations.  This
1      option is enabled by default.
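
     A minimal sketch of how a language constraint bounds a loop.  The
     names 'TABLE_SIZE', 'squares' and 'fill_squares' are assumptions
     for illustration.

```c
#define TABLE_SIZE 8
static int squares[TABLE_SIZE];

/* Accessing squares[i] with i >= TABLE_SIZE would be undefined
   behavior, so the compiler may assume the body runs at most
   TABLE_SIZE times, whatever value 'count' has at run time.
   The caller is responsible for passing count <= TABLE_SIZE. */
int fill_squares(int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++) {
        squares[i] = i * i;
        sum += squares[i];
    }
    return sum;
}
```

     Code that relies on running past the array end is already
     incorrect; this option only makes the consequences more visible.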
1 
1 '-funconstrained-commons'
1      This option tells the compiler that variables declared in common
1      blocks (e.g.  Fortran) may later be overridden with longer trailing
1      arrays.  This prevents certain optimizations that depend on knowing
1      the array bounds.
1 
1 '-fcrossjumping'
1      Perform cross-jumping transformation.  This transformation unifies
1      equivalent code and saves code size.  The resulting code may or may
1      not perform better than without cross-jumping.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fauto-inc-dec'
1      Combine increments or decrements of addresses with memory accesses.
1      This pass is always skipped on architectures that do not have
1      instructions to support this.  Enabled by default at '-O' and
1      higher on architectures that support this.
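
     The sketch below shows a memory access paired with an address
     increment, the pattern this pass looks for.  'sum_bytes' is a
     hypothetical name.

```c
/* Each iteration reads *p and then advances p.  On a target with
   post-increment addressing, the access and the increment can be
   combined into one instruction. */
long sum_bytes(const unsigned char *p, int n)
{
    long sum = 0;
    while (n-- > 0)
        sum += *p++;
    return sum;
}
```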
1 
1 '-fdce'
1      Perform dead code elimination (DCE) on RTL.  Enabled by default at
1      '-O' and higher.
1 
1 '-fdse'
1      Perform dead store elimination (DSE) on RTL.  Enabled by default at
1      '-O' and higher.
1 
1 '-fif-conversion'
     Attempt to transform conditional jumps into branch-less
     equivalents.  This includes use of conditional moves, min, max, set
     flags and abs instructions, and some tricks doable by standard
     arithmetic.  The use of conditional execution on chips where it is
     available is controlled by '-fif-conversion2'.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
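
     Two small functions of the shape that if-conversion typically
     targets.  The names are made up, and the instructions actually
     chosen depend on the target.

```c
/* Candidate for a max instruction or a conditional move. */
int max_int(int a, int b)
{
    if (a > b)
        return a;
    return b;
}

/* Candidate for an abs instruction.  As with abs(), the result is
   undefined for INT_MIN. */
int abs_int(int x)
{
    if (x < 0)
        return -x;
    return x;
}
```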
1 
1 '-fif-conversion2'
1      Use conditional execution (where available) to transform
1      conditional jumps into branch-less equivalents.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-fdeclone-ctor-dtor'
1      The C++ ABI requires multiple entry points for constructors and
1      destructors: one for a base subobject, one for a complete object,
1      and one for a virtual destructor that calls operator delete
1      afterwards.  For a hierarchy with virtual bases, the base and
1      complete variants are clones, which means two copies of the
1      function.  With this option, the base and complete variants are
1      changed to be thunks that call a common implementation.
1 
1      Enabled by '-Os'.
1 
1 '-fdelete-null-pointer-checks'
1      Assume that programs cannot safely dereference null pointers, and
1      that no code or data element resides at address zero.  This option
1      enables simple constant folding optimizations at all optimization
1      levels.  In addition, other optimization passes in GCC use this
1      flag to control global dataflow analyses that eliminate useless
1      checks for null pointers; these assume that a memory access to
1      address zero always results in a trap, so that if a pointer is
1      checked after it has already been dereferenced, it cannot be null.
1 
1      Note however that in some environments this assumption is not true.
1      Use '-fno-delete-null-pointer-checks' to disable this optimization
1      for programs that depend on that behavior.
1 
1      This option is enabled by default on most targets.  On Nios II ELF,
1      it defaults to off.  On AVR, CR16, and MSP430, this option is
1      completely disabled.
1 
1      Passes that use the dataflow information are enabled independently
1      at different optimization levels.
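
     The classic case of a check that follows a dereference looks like
     the sketch below.  The function name 'first_or_default' is
     hypothetical.

```c
#include <stddef.h>

/* p is dereferenced before it is tested.  Under
   -fdelete-null-pointer-checks the compiler may assume that the
   dereference would have trapped for a null p, conclude that p is
   non-null afterwards, and remove the check. */
int first_or_default(const int *p)
{
    int value = *p;       /* dereference comes first */
    if (p == NULL)        /* may be optimized away */
        return -1;
    return value;
}
```

     Passing a null pointer to this function is undefined behavior in
     any case; testing the pointer before the dereference avoids the
     problem.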
1 
1 '-fdevirtualize'
1      Attempt to convert calls to virtual functions to direct calls.
1      This is done both within a procedure and interprocedurally as part
1      of indirect inlining ('-findirect-inlining') and interprocedural
1      constant propagation ('-fipa-cp').  Enabled at levels '-O2', '-O3',
1      '-Os'.
1 
1 '-fdevirtualize-speculatively'
1      Attempt to convert calls to virtual functions to speculative direct
1      calls.  Based on the analysis of the type inheritance graph,
1      determine for a given call the set of likely targets.  If the set
1      is small, preferably of size 1, change the call into a conditional
1      deciding between direct and indirect calls.  The speculative calls
1      enable more optimizations, such as inlining.  When they seem
1      useless after further optimization, they are converted back into
1      original form.
1 
1 '-fdevirtualize-at-ltrans'
1      Stream extra information needed for aggressive devirtualization
1      when running the link-time optimizer in local transformation mode.
1      This option enables more devirtualization but significantly
1      increases the size of streamed data.  For this reason it is
1      disabled by default.
1 
1 '-fexpensive-optimizations'
1      Perform a number of minor optimizations that are relatively
1      expensive.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-free'
1      Attempt to remove redundant extension instructions.  This is
1      especially helpful for the x86-64 architecture, which implicitly
1      zero-extends in 64-bit registers after writing to their lower
1      32-bit half.
1 
1      Enabled for Alpha, AArch64 and x86 at levels '-O2', '-O3', '-Os'.
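
     A short sketch of an extension that can be redundant on x86-64.
     The name 'widen_sum' is an assumption for the example.

```c
#include <stdint.h>

/* The 32-bit addition already leaves the upper half of the 64-bit
   register zeroed on x86-64, so a separate zero-extension
   instruction for the conversion is unnecessary there. */
uint64_t widen_sum(uint32_t a, uint32_t b)
{
    uint32_t s = a + b;      /* wraps modulo 2^32 */
    return (uint64_t)s;      /* zero-extension to 64 bits */
}
```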
1 
1 '-fno-lifetime-dse'
1      In C++ the value of an object is only affected by changes within
1      its lifetime: when the constructor begins, the object has an
1      indeterminate value, and any changes during the lifetime of the
1      object are dead when the object is destroyed.  Normally dead store
1      elimination will take advantage of this; if your code relies on the
1      value of the object storage persisting beyond the lifetime of the
1      object, you can use this flag to disable this optimization.  To
1      preserve stores before the constructor starts (e.g.  because your
1      operator new clears the object storage) but still treat the object
     as dead after the destructor, you can use '-flifetime-dse=1'.  The
1      default behavior can be explicitly selected with
1      '-flifetime-dse=2'.  '-flifetime-dse=0' is equivalent to
1      '-fno-lifetime-dse'.
1 
1 '-flive-range-shrinkage'
1      Attempt to decrease register pressure through register live range
1      shrinkage.  This is helpful for fast processors with small or
1      moderate size register sets.
1 
1 '-fira-algorithm=ALGORITHM'
1      Use the specified coloring algorithm for the integrated register
1      allocator.  The ALGORITHM argument can be 'priority', which
1      specifies Chow's priority coloring, or 'CB', which specifies
1      Chaitin-Briggs coloring.  Chaitin-Briggs coloring is not
1      implemented for all architectures, but for those targets that do
1      support it, it is the default because it generates better code.
1 
1 '-fira-region=REGION'
1      Use specified regions for the integrated register allocator.  The
1      REGION argument should be one of the following:
1 
1      'all'
1           Use all loops as register allocation regions.  This can give
1           the best results for machines with a small and/or irregular
1           register set.
1 
1      'mixed'
          Use all loops except for loops with small register pressure as
          the regions.  This value gives the best results in most cases
          and for most architectures, and is enabled by default when
          compiling with optimization for speed ('-O', '-O2', ...).
1 
1      'one'
1           Use all functions as a single region.  This typically results
1           in the smallest code size, and is enabled by default for '-Os'
1           or '-O0'.
1 
1 '-fira-hoist-pressure'
1      Use IRA to evaluate register pressure in the code hoisting pass for
1      decisions to hoist expressions.  This option usually results in
1      smaller code, but it can slow the compiler down.
1 
1      This option is enabled at level '-Os' for all targets.
1 
1 '-fira-loop-pressure'
1      Use IRA to evaluate register pressure in loops for decisions to
1      move loop invariants.  This option usually results in generation of
1      faster and smaller code on machines with large register files (>=
1      32 registers), but it can slow the compiler down.
1 
1      This option is enabled at level '-O3' for some targets.
1 
1 '-fno-ira-share-save-slots'
1      Disable sharing of stack slots used for saving call-used hard
1      registers living through a call.  Each hard register gets a
1      separate stack slot, and as a result function stack frames are
1      larger.
1 
1 '-fno-ira-share-spill-slots'
1      Disable sharing of stack slots allocated for pseudo-registers.
1      Each pseudo-register that does not get a hard register gets a
1      separate stack slot, and as a result function stack frames are
1      larger.
1 
1 '-flra-remat'
1      Enable CFG-sensitive rematerialization in LRA. Instead of loading
1      values of spilled pseudos, LRA tries to rematerialize (recalculate)
1      values if it is profitable.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fdelayed-branch'
1      If supported for the target machine, attempt to reorder
1      instructions to exploit instruction slots available after delayed
1      branch instructions.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-fschedule-insns'
1      If supported for the target machine, attempt to reorder
1      instructions to eliminate execution stalls due to required data
1      being unavailable.  This helps machines that have slow floating
1      point or memory load instructions by allowing other instructions to
1      be issued until the result of the load or floating-point
1      instruction is required.
1 
1      Enabled at levels '-O2', '-O3'.
1 
1 '-fschedule-insns2'
1      Similar to '-fschedule-insns', but requests an additional pass of
1      instruction scheduling after register allocation has been done.
1      This is especially useful on machines with a relatively small
1      number of registers and where memory load instructions take more
1      than one cycle.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fno-sched-interblock'
1      Don't schedule instructions across basic blocks.  This is normally
1      enabled by default when scheduling before register allocation, i.e.
1      with '-fschedule-insns' or at '-O2' or higher.
1 
1 '-fno-sched-spec'
1      Don't allow speculative motion of non-load instructions.  This is
1      normally enabled by default when scheduling before register
1      allocation, i.e. with '-fschedule-insns' or at '-O2' or higher.
1 
1 '-fsched-pressure'
1      Enable register pressure sensitive insn scheduling before register
1      allocation.  This only makes sense when scheduling before register
1      allocation is enabled, i.e. with '-fschedule-insns' or at '-O2' or
1      higher.  Usage of this option can improve the generated code and
1      decrease its size by preventing register pressure increase above
1      the number of available hard registers and subsequent spills in
1      register allocation.
1 
1 '-fsched-spec-load'
1      Allow speculative motion of some load instructions.  This only
1      makes sense when scheduling before register allocation, i.e. with
1      '-fschedule-insns' or at '-O2' or higher.
1 
1 '-fsched-spec-load-dangerous'
1      Allow speculative motion of more load instructions.  This only
1      makes sense when scheduling before register allocation, i.e. with
1      '-fschedule-insns' or at '-O2' or higher.
1 
1 '-fsched-stalled-insns'
1 '-fsched-stalled-insns=N'
1      Define how many insns (if any) can be moved prematurely from the
1      queue of stalled insns into the ready list during the second
1      scheduling pass.  '-fno-sched-stalled-insns' means that no insns
1      are moved prematurely; '-fsched-stalled-insns=0' means there is no
1      limit on how many queued insns can be moved prematurely.
1      '-fsched-stalled-insns' without a value is equivalent to
1      '-fsched-stalled-insns=1'.
1 
1 '-fsched-stalled-insns-dep'
1 '-fsched-stalled-insns-dep=N'
1      Define how many insn groups (cycles) are examined for a dependency
1      on a stalled insn that is a candidate for premature removal from
1      the queue of stalled insns.  This has an effect only during the
1      second scheduling pass, and only if '-fsched-stalled-insns' is
1      used.  '-fno-sched-stalled-insns-dep' is equivalent to
1      '-fsched-stalled-insns-dep=0'.  '-fsched-stalled-insns-dep' without
1      a value is equivalent to '-fsched-stalled-insns-dep=1'.
1 
1 '-fsched2-use-superblocks'
1      When scheduling after register allocation, use superblock
1      scheduling.  This allows motion across basic block boundaries,
1      resulting in faster schedules.  This option is experimental, as not
1      all machine descriptions used by GCC model the CPU closely enough
1      to avoid unreliable results from the algorithm.
1 
1      This only makes sense when scheduling after register allocation,
1      i.e. with '-fschedule-insns2' or at '-O2' or higher.
1 
1 '-fsched-group-heuristic'
1      Enable the group heuristic in the scheduler.  This heuristic favors
1      the instruction that belongs to a schedule group.  This is enabled
1      by default when scheduling is enabled, i.e. with '-fschedule-insns'
1      or '-fschedule-insns2' or at '-O2' or higher.
1 
1 '-fsched-critical-path-heuristic'
1      Enable the critical-path heuristic in the scheduler.  This
1      heuristic favors instructions on the critical path.  This is
1      enabled by default when scheduling is enabled, i.e. with
1      '-fschedule-insns' or '-fschedule-insns2' or at '-O2' or higher.
1 
1 '-fsched-spec-insn-heuristic'
1      Enable the speculative instruction heuristic in the scheduler.
1      This heuristic favors speculative instructions with greater
1      dependency weakness.  This is enabled by default when scheduling is
1      enabled, i.e. with '-fschedule-insns' or '-fschedule-insns2' or at
1      '-O2' or higher.
1 
1 '-fsched-rank-heuristic'
1      Enable the rank heuristic in the scheduler.  This heuristic favors
1      the instruction belonging to a basic block with greater size or
1      frequency.  This is enabled by default when scheduling is enabled,
1      i.e. with '-fschedule-insns' or '-fschedule-insns2' or at '-O2' or
1      higher.
1 
1 '-fsched-last-insn-heuristic'
1      Enable the last-instruction heuristic in the scheduler.  This
1      heuristic favors the instruction that is less dependent on the last
1      instruction scheduled.  This is enabled by default when scheduling
1      is enabled, i.e. with '-fschedule-insns' or '-fschedule-insns2' or
1      at '-O2' or higher.
1 
1 '-fsched-dep-count-heuristic'
1      Enable the dependent-count heuristic in the scheduler.  This
1      heuristic favors the instruction that has more instructions
1      depending on it.  This is enabled by default when scheduling is
1      enabled, i.e. with '-fschedule-insns' or '-fschedule-insns2' or at
1      '-O2' or higher.
1 
1 '-freschedule-modulo-scheduled-loops'
1      Modulo scheduling is performed before traditional scheduling.  If a
1      loop is modulo scheduled, later scheduling passes may change its
1      schedule.  Use this option to control that behavior.
1 
1 '-fselective-scheduling'
1      Schedule instructions using the selective scheduling algorithm.
1      Selective scheduling runs instead of the first scheduler pass.
1 
1 '-fselective-scheduling2'
1      Schedule instructions using the selective scheduling algorithm.
1      Selective scheduling runs instead of the second scheduler pass.
1 
1 '-fsel-sched-pipelining'
1      Enable software pipelining of innermost loops during selective
1      scheduling.  This option has no effect unless one of
1      '-fselective-scheduling' or '-fselective-scheduling2' is turned on.
1 
1 '-fsel-sched-pipelining-outer-loops'
1      When pipelining loops during selective scheduling, also pipeline
1      outer loops.  This option has no effect unless
1      '-fsel-sched-pipelining' is turned on.
1 
1 '-fsemantic-interposition'
1      Some object formats, like ELF, allow interposing of symbols by the
1      dynamic linker.  This means that for symbols exported from the DSO,
1      the compiler cannot perform interprocedural propagation, inlining
1      and other optimizations in anticipation that the function or
1      variable in question may change.  While this feature is useful, for
1      example, to replace memory allocation functions with a debugging
1      implementation, it is expensive in terms of code quality.  With
1      '-fno-semantic-interposition' the compiler assumes that if
1      interposition happens for functions the overwriting function will
1      have precisely the same semantics (and side effects).  Similarly if
1      interposition happens for variables, the constructor of the
1      variable will be the same.  The flag has no effect for functions
1      explicitly declared inline (where it is never allowed for
1      interposition to change semantics) and for symbols explicitly
1      declared weak.
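1 
1      As a rough sketch of the situation described above, consider the
1      following C fragment, imagined as part of a shared library built
1      with '-fPIC'.  The function names are hypothetical and chosen only
1      for illustration.  When interposition has to be honoured, the call
1      to 'scale_factor' generally cannot be inlined into 'scaled',
1      because the dynamic linker may bind the symbol to another
1      definition.  With '-fno-semantic-interposition' GCC may assume that
1      any replacement behaves identically, so inlining becomes possible.
1 
```c
/* Hypothetical exported library function.  Another DSO, or a library
   loaded with LD_PRELOAD, could interpose a different definition.  */
int scale_factor(void)
{
    return 3;
}

/* Hypothetical caller in the same DSO.  With interposition semantics
   this call is normally kept as a real (possibly PLT) call; with
   -fno-semantic-interposition the compiler may inline it and fold
   the multiplication by 3.  */
int scaled(int x)
{
    return x * scale_factor();
}
```
1 
1      Whether the call is actually inlined depends on the GCC version,
1      the target and the optimization level.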
1 
1 '-fshrink-wrap'
1      Emit function prologues only before parts of the function that need
1      it, rather than at the top of the function.  This flag is enabled
1      by default at '-O' and higher.
1 
1 '-fshrink-wrap-separate'
1      Shrink-wrap separate parts of the prologue and epilogue separately,
1      so that those parts are only executed when needed.  This option is
1      on by default, but has no effect unless '-fshrink-wrap' is also
1      turned on and the target supports this.
1 
1 '-fcaller-saves'
1      Enable allocation of values to registers that are clobbered by
1      function calls, by emitting extra instructions to save and restore
1      the registers around such calls.  Such allocation is done only when
1      it seems to result in better code.
1 
1      This option is always enabled by default on certain machines,
1      usually those which have no call-preserved registers to use
1      instead.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fcombine-stack-adjustments'
1      Tracks stack adjustments (pushes and pops) and stack memory
1      references and then tries to find ways to combine them.
1 
1      Enabled by default at '-O1' and higher.
1 
1 '-fipa-ra'
1      Use caller save registers for allocation if those registers are not
1      used by any called function.  In that case it is not necessary to
1      save and restore them around calls.  This is only possible if
1      called functions are part of the same compilation unit as the
1      current function and they are compiled before it.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.  However, the option is
1      disabled if the generated code will be instrumented for profiling
1      ('-p', or '-pg') or if the callee's register usage cannot be known
1      exactly (this happens on targets that do not expose prologues and
1      epilogues in RTL).
1 
1 '-fconserve-stack'
1      Attempt to minimize stack usage.  The compiler attempts to use less
1      stack space, even if that makes the program slower.  This option
1      implies setting the 'large-stack-frame' parameter to 100 and the
1      'large-stack-frame-growth' parameter to 400.
1 
1 '-ftree-reassoc'
1      Perform reassociation on trees.  This flag is enabled by default at
1      '-O' and higher.
1 
1 '-fcode-hoisting'
1      Perform code hoisting.  Code hoisting tries to move the evaluation
1      of expressions executed on all paths to the function exit as early
1      as possible.  This is especially useful as a code size
1      optimization, but it often helps for code speed as well.  This flag
1      is enabled by default at '-O2' and higher.
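1 
1      The effect can be pictured at the source level with the following
1      sketch.  The function names are made up for this example, and GCC
1      performs the transformation on its internal representation rather
1      than on C source, so this only approximates what the pass does.
1 
```c
/* The product a * b is evaluated on both paths to the function exit.  */
int select_unhoisted(int a, int b, int flag)
{
    int r;
    if (flag)
        r = a * b + 1;
    else
        r = a * b - 1;
    return r;
}

/* Roughly what code hoisting aims for: the common expression is
   evaluated once, before the branch, which also shrinks the code.  */
int select_hoisted(int a, int b, int flag)
{
    int t = a * b;
    return flag ? t + 1 : t - 1;
}
```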
1 
1 '-ftree-pre'
1      Perform partial redundancy elimination (PRE) on trees.  This flag
1      is enabled by default at '-O2' and '-O3'.
1 
1 '-ftree-partial-pre'
1      Make partial redundancy elimination (PRE) more aggressive.  This
1      flag is enabled by default at '-O3'.
1 
1 '-ftree-forwprop'
1      Perform forward propagation on trees.  This flag is enabled by
1      default at '-O' and higher.
1 
1 '-ftree-fre'
1      Perform full redundancy elimination (FRE) on trees.  The difference
1      between FRE and PRE is that FRE only considers expressions that are
1      computed on all paths leading to the redundant computation.  This
1      analysis is faster than PRE, though it exposes fewer redundancies.
1      This flag is enabled by default at '-O' and higher.
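1 
1      The distinction between the two kinds of redundancy can be
1      sketched in C as follows.  The function names are hypothetical,
1      and which pass actually removes which computation may vary with
1      the GCC version.
1 
```c
/* Full redundancy: the second a + b is computed on every path that
   reaches it, so it is a candidate for FRE.  */
int fully_redundant(int a, int b)
{
    int x = a + b;
    int y = a + b;
    return x * y;
}

/* Partial redundancy: a + b has already been computed only on the
   path where flag is nonzero, so removing the second computation
   needs PRE, which inserts a computation on the other path.  */
int partially_redundant(int a, int b, int flag)
{
    int x = 0;
    if (flag)
        x = a + b;
    int y = a + b;
    return x + y;
}
```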
1 
1 '-ftree-phiprop'
1      Perform hoisting of loads from conditional pointers on trees.  This
1      pass is enabled by default at '-O' and higher.
1 
1 '-fhoist-adjacent-loads'
1      Speculatively hoist loads from both branches of an if-then-else if
1      the loads are from adjacent locations in the same structure and the
1      target architecture has a conditional move instruction.  This flag
1      is enabled by default at '-O2' and higher.
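1 
1      A minimal sketch of the pattern this option targets, using a
1      hypothetical structure and function names.  The second function
1      shows, at the source level, approximately what the hoisted form
1      looks like; it is an illustration, not GCC's actual output.
1 
```c
struct pair {
    int first;
    int second;   /* adjacent to 'first' in the same structure */
};

/* One load in each branch of the if-then-else.  */
int pick(const struct pair *p, int use_first)
{
    if (use_first)
        return p->first;
    else
        return p->second;
}

/* Both loads are performed unconditionally, and the result is chosen
   afterwards, which maps well onto a conditional move.  */
int pick_hoisted(const struct pair *p, int use_first)
{
    int a = p->first;
    int b = p->second;
    return use_first ? a : b;
}
```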
1 
1 '-ftree-copy-prop'
1      Perform copy propagation on trees.  This pass eliminates
1      unnecessary copy operations.  This flag is enabled by default at
1      '-O' and higher.
1 
1 '-fipa-pure-const'
1      Discover which functions are pure or constant.  Enabled by default
1      at '-O' and higher.
1 
1 '-fipa-reference'
1      Discover which static variables do not escape the compilation unit.
1      Enabled by default at '-O' and higher.
1 
1 '-fipa-pta'
1      Perform interprocedural pointer analysis and interprocedural
1      modification and reference analysis.  This option can cause
1      excessive memory and compile-time usage on large compilation units.
1      It is not enabled by default at any optimization level.
1 
1 '-fipa-profile'
1      Perform interprocedural profile propagation.  The functions called
1      only from cold functions are marked as cold.  Also functions
1      executed once (such as 'cold', 'noreturn', static constructors or
1      destructors) are identified.  Cold functions and loopless parts of
1      functions executed once are then optimized for size.  Enabled by
1      default at '-O' and higher.
1 
1 '-fipa-cp'
1      Perform interprocedural constant propagation.  This optimization
1      analyzes the program to determine when values passed to functions
1      are constants and then optimizes accordingly.  This optimization
1      can substantially increase performance if the application has
1      constants passed to functions.  This flag is enabled by default at
1      '-O2', '-Os' and '-O3'.
1 
1 '-fipa-cp-clone'
1      Perform function cloning to make interprocedural constant
1      propagation stronger.  When enabled, interprocedural constant
1      propagation performs function cloning when an externally visible
1      function can be called with constant arguments.  Because this
1      optimization can create multiple copies of functions, it may
1      significantly increase code size (see '--param
1      ipcp-unit-growth=VALUE').  This flag is enabled by default at
1      '-O3'.
1 
1 '-fipa-bit-cp'
1      When enabled, perform interprocedural bitwise constant propagation.
1      This flag is enabled by default at '-O2'.  It requires that
1      '-fipa-cp' is enabled.
1 
1 '-fipa-vrp'
1      When enabled, perform interprocedural propagation of value ranges.
1      This flag is enabled by default at '-O2'.  It requires that
1      '-fipa-cp' is enabled.
1 
1 '-fipa-icf'
1      Perform Identical Code Folding for functions and read-only
1      variables.  The optimization reduces code size and may disturb
1      unwind stacks by replacing a function by an equivalent one with a
1      different name.  The optimization works more effectively with
1      link-time optimization enabled.
1 
1      Although the behavior is similar to the Gold linker's ICF
1      optimization, GCC ICF works on different levels and thus the
1      optimizations are not the same: there are equivalences that are
1      found only by GCC and equivalences found only by Gold.
1 
1      This flag is enabled by default at '-O2' and '-Os'.
1 
1 '-flive-patching=LEVEL'
1      Control GCC's optimizations to produce output suitable for
1      live-patching.
1 
1      If the compiler's optimization uses a function's body or
1      information extracted from its body to optimize/change another
1      function, the latter is called an impacted function of the former.
1      If a function is patched, its impacted functions should be patched
1      too.
1 
1      The impacted functions are determined by the compiler's
1      interprocedural optimizations.  For example, a caller is impacted
1      when inlining a function into its caller, cloning a function and
1      changing its caller to call this new clone, or extracting a
1      function's pureness/constness information to optimize its direct or
1      indirect callers, etc.
1 
1      Usually, the more IPA optimizations enabled, the larger the number
1      of impacted functions for each function.  In order to control the
1      number of impacted functions and more easily compute the list of
1      impacted functions, IPA optimizations can be partially enabled at
1      two different levels.
1 
1      The LEVEL argument should be one of the following:
1 
1      'inline-clone'
1 
1           Only enable inlining and cloning optimizations, which includes
1           inlining, cloning, interprocedural scalar replacement of
1           aggregates and partial inlining.  As a result, when patching a
1           function, all its callers and its clones' callers are
1           impacted, therefore need to be patched as well.
1 
1           '-flive-patching=inline-clone' disables the following
1           optimization flags:
1                -fwhole-program  -fipa-pta  -fipa-reference  -fipa-ra
1                -fipa-icf  -fipa-icf-functions  -fipa-icf-variables
1                -fipa-bit-cp  -fipa-vrp  -fipa-pure-const  -fipa-reference-addressable
1                -fipa-stack-alignment
1 
1      'inline-only-static'
1 
1           Only enable inlining of static functions.  As a result, when
1           patching a static function, all its callers are impacted and
1           so need to be patched as well.
1 
1           In addition to all the flags that
1           '-flive-patching=inline-clone' disables,
1           '-flive-patching=inline-only-static' disables the following
1           additional optimization flags:
1                -fipa-cp-clone  -fipa-sra  -fpartial-inlining  -fipa-cp
1 
1      When '-flive-patching' is specified without any value, the default
1      value is 'inline-clone'.
1 
1      This flag is disabled by default.
1 
1      Note that '-flive-patching' is not supported with link-time
1      optimization ('-flto').
1 
1 '-fisolate-erroneous-paths-dereference'
1      Detect paths that trigger erroneous or undefined behavior due to
1      dereferencing a null pointer.  Isolate those paths from the main
1      control flow and turn the statement with erroneous or undefined
1      behavior into a trap.  This flag is enabled by default at '-O2' and
1      higher and depends on '-fdelete-null-pointer-checks' also being
1      enabled.
1 
1 '-fisolate-erroneous-paths-attribute'
1      Detect paths that trigger erroneous or undefined behavior due to a
1      null value being used in a way forbidden by a 'returns_nonnull' or
1      'nonnull' attribute.  Isolate those paths from the main control
1      flow and turn the statement with erroneous or undefined behavior
1      into a trap.  This is not currently enabled, but may be enabled by
1      '-O2' in the future.
1 
1 '-ftree-sink'
1      Perform forward store motion on trees.  This flag is enabled by
1      default at '-O' and higher.
1 
1 '-ftree-bit-ccp'
1      Perform sparse conditional bit constant propagation on trees and
1      propagate pointer alignment information.  This pass only operates
1      on local scalar variables and is enabled by default at '-O' and
1      higher.  It requires that '-ftree-ccp' is enabled.
1 
1 '-ftree-ccp'
1      Perform sparse conditional constant propagation (CCP) on trees.
1      This pass only operates on local scalar variables and is enabled by
1      default at '-O' and higher.
1 
1 '-fssa-backprop'
1      Propagate information about uses of a value up the definition chain
1      in order to simplify the definitions.  For example, this pass
1      strips sign operations if the sign of a value never matters.  The
1      flag is enabled by default at '-O' and higher.
1 
1 '-fssa-phiopt'
1      Perform pattern matching on SSA PHI nodes to optimize conditional
1      code.  This pass is enabled by default at '-O' and higher.
1 
1 '-ftree-switch-conversion'
1      Perform conversion of simple initializations in a switch to
1      initializations from a scalar array.  This flag is enabled by
1      default at '-O2' and higher.
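1 
1      As an illustration, the two functions below compute the same
1      result.  The first uses a 'switch' whose cases only assign
1      constants; the second is a hand-written approximation of the
1      array-based form.  Names and values are invented for this sketch.
1 
```c
/* A switch that only initializes a variable from constants.  */
int code_switch(int k)
{
    int v;
    switch (k) {
    case 0:  v = 10; break;
    case 1:  v = 20; break;
    case 2:  v = 40; break;
    case 3:  v = 80; break;
    default: v = -1; break;
    }
    return v;
}

/* Approximately the converted form: a range check followed by a
   load from a constant scalar array.  */
int code_table(int k)
{
    static const int table[] = { 10, 20, 40, 80 };
    if (k >= 0 && k < 4)
        return table[k];
    return -1;
}
```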
1 
1 '-ftree-tail-merge'
1      Look for identical code sequences.  When found, replace one with a
1      jump to the other.  This optimization is known as tail merging or
1      cross jumping.  This flag is enabled by default at '-O2' and
1      higher.  The compilation time in this pass can be limited using the
1      'max-tail-merge-comparisons' and 'max-tail-merge-iterations'
1      parameters.
1 
1 '-ftree-dce'
1      Perform dead code elimination (DCE) on trees.  This flag is enabled
1      by default at '-O' and higher.
1 
1 '-ftree-builtin-call-dce'
1      Perform conditional dead code elimination (DCE) for calls to
1      built-in functions that may set 'errno' but are otherwise free of
1      side effects.  This flag is enabled by default at '-O2' and higher
1      if '-Os' is not also specified.
1 
1 '-ftree-dominator-opts'
1      Perform a variety of simple scalar cleanups (constant/copy
1      propagation, redundancy elimination, range propagation and
1      expression simplification) based on a dominator tree traversal.
1      This also performs jump threading (to reduce jumps to jumps).  This
1      flag is enabled by default at '-O' and higher.
1 
1 '-ftree-dse'
1      Perform dead store elimination (DSE) on trees.  A dead store is a
1      store into a memory location that is later overwritten by another
1      store without any intervening loads.  In this case the earlier
1      store can be deleted.  This flag is enabled by default at '-O' and
1      higher.
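1 
1      A small sketch of a dead store, with a hypothetical function name.
1      It assumes the pointer does not refer to a 'volatile' object, in
1      which case neither store could be removed.
1 
```c
void reset_counter(int *counter)
{
    *counter = 0;    /* dead: overwritten below with no load in between */
    *counter = 42;   /* only this store needs to be kept */
}
```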
1 
1 '-ftree-ch'
1      Perform loop header copying on trees.  This is beneficial since it
1      increases effectiveness of code motion optimizations.  It also
1      saves one jump.  This flag is enabled by default at '-O' and
1      higher.  It is not enabled for '-Os', since it usually increases
1      code size.
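1 
1      Loop header copying can be pictured as turning a 'while' loop
1      into a guarded 'do'-'while' loop.  The sketch below uses
1      hypothetical function names and shows the idea at the source
1      level only; GCC applies it to its internal representation.
1 
```c
/* Original form: the exit test sits in the loop header, and each
   iteration ends with a jump back to it.  */
int sum_while(const int *v, int n)
{
    int s = 0, i = 0;
    while (i < n) {
        s += v[i];
        i++;
    }
    return s;
}

/* After copying the header: the test is duplicated in front of the
   loop, and the loop body is now known to execute at least once,
   which helps code motion out of the loop.  */
int sum_rotated(const int *v, int n)
{
    int s = 0, i = 0;
    if (i < n) {
        do {
            s += v[i];
            i++;
        } while (i < n);
    }
    return s;
}
```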
1 
1 '-ftree-loop-optimize'
1      Perform loop optimizations on trees.  This flag is enabled by
1      default at '-O' and higher.
1 
1 '-ftree-loop-linear'
1 '-floop-strip-mine'
1 '-floop-block'
1      Perform loop nest optimizations.  Same as '-floop-nest-optimize'.
1      To use this code transformation, GCC has to be configured with
1      '--with-isl' to enable the Graphite loop transformation
1      infrastructure.
1 
1 '-fgraphite-identity'
1      Enable the identity transformation for graphite.  For every SCoP we
1      generate the polyhedral representation and transform it back to
1      gimple.  Using '-fgraphite-identity' we can check the costs or
1      benefits of the GIMPLE -> GRAPHITE -> GIMPLE transformation.  Some
1      minimal optimizations are also performed by the isl code generator,
1      like index splitting and dead code elimination in loops.
1 
1 '-floop-nest-optimize'
1      Enable the isl based loop nest optimizer.  This is a generic loop
1      nest optimizer based on the Pluto optimization algorithms.  It
1      calculates a loop structure optimized for data-locality and
1      parallelism.  This option is experimental.
1 
1 '-floop-parallelize-all'
1      Use the Graphite data dependence analysis to identify loops that
1      can be parallelized.  Parallelize all the loops that can be
1      analyzed to not contain loop carried dependences without checking
1      that it is profitable to parallelize the loops.
1 
1 '-ftree-coalesce-vars'
1      While transforming the program out of the SSA representation,
1      attempt to reduce copying by coalescing versions of different
1      user-defined variables, instead of just compiler temporaries.  This
1      may severely limit the ability to debug an optimized program
1      compiled with '-fno-var-tracking-assignments'.  In the negated
1      form, this flag prevents SSA coalescing of user variables.  This
1      option is enabled by default if optimization is enabled, and it
1      does very little otherwise.
1 
1 '-ftree-loop-if-convert'
1      Attempt to transform conditional jumps in the innermost loops to
1      branch-less equivalents.  The intent is to remove control-flow from
1      the innermost loops in order to improve the ability of the
1      vectorization pass to handle these loops.  This is enabled by
1      default if vectorization is enabled.
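1 
1      The following sketch shows the kind of rewrite involved, using
1      hypothetical function names.  The second function is a source-level
1      approximation of the branch-less form; it assumes the sums do not
1      overflow.
1 
```c
/* Conditional jump inside the innermost loop.  */
int sum_positive_branchy(const int *v, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] > 0)
            s += v[i];
    }
    return s;
}

/* Branch-less equivalent: the condition selects a value instead of
   steering control flow, which is easier to vectorize.  */
int sum_positive_branchless(const int *v, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += (v[i] > 0) ? v[i] : 0;
    return s;
}
```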
1 
1 '-ftree-loop-distribution'
1      Perform loop distribution.  This flag can improve cache performance
1      on big loop bodies and allow further loop optimizations, like
1      parallelization or vectorization, to take place.  For example, the
1      loop
1           DO I = 1, N
1             A(I) = B(I) + C
1             D(I) = E(I) * F
1           ENDDO
1      is transformed to
1           DO I = 1, N
1              A(I) = B(I) + C
1           ENDDO
1           DO I = 1, N
1              D(I) = E(I) * F
1           ENDDO
1 
1 '-ftree-loop-distribute-patterns'
1      Perform loop distribution of patterns that can be code generated
1      with calls to a library.  This flag is enabled by default at '-O3'.
1 
1      This pass distributes the initialization loops and generates a call
1      to 'memset' that zeroes the array.  For example, the loop
1           DO I = 1, N
1             A(I) = 0
1             B(I) = A(I) + I
1           ENDDO
1      is transformed to
1           DO I = 1, N
1              A(I) = 0
1           ENDDO
1           DO I = 1, N
1              B(I) = A(I) + I
1           ENDDO
1      and the initialization loop is transformed into a call to 'memset'
1      that zeroes 'A'.
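1 
1      A C analogue of the Fortran example, written by hand as a sketch.
1      The function names are hypothetical, indices are 0-based rather
1      than 1-based, and the arrays 'a' and 'b' are assumed not to
1      overlap.
1 
```c
#include <stddef.h>
#include <string.h>

/* Original loop: zero a[] and fill b[] in one pass.  */
void init_fused(int *a, int *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = 0;
        b[i] = a[i] + (int)i;
    }
}

/* Approximately the distributed form: the zeroing loop has become a
   memset call, and the remaining loop fills b[].  */
void init_distributed(int *a, int *b, size_t n)
{
    memset(a, 0, n * sizeof *a);
    for (size_t i = 0; i < n; i++)
        b[i] = a[i] + (int)i;
}
```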
1 
1 '-floop-interchange'
1      Perform loop interchange outside of graphite.  This flag can
1      improve cache performance on loop nests and allow further loop
1      optimizations, like vectorization, to take place.  For example, the
1      loop
1           for (int i = 0; i < N; i++)
1             for (int j = 0; j < N; j++)
1               for (int k = 0; k < N; k++)
1                 c[i][j] = c[i][j] + a[i][k]*b[k][j];
1      is transformed to
1           for (int i = 0; i < N; i++)
1             for (int k = 0; k < N; k++)
1               for (int j = 0; j < N; j++)
1                 c[i][j] = c[i][j] + a[i][k]*b[k][j];
1      This flag is enabled by default at '-O3'.
1 
1 '-floop-unroll-and-jam'
1      Apply unroll and jam transformations on feasible loops.  In a loop
1      nest this unrolls the outer loop by some factor and fuses the
1      resulting multiple inner loops.  This flag is enabled by default at
1      '-O3'.
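1 
1      A hand-written sketch of unroll and jam with an unroll factor of
1      2.  The array dimensions and function names are assumptions made
1      for the example; 'ROWS' is chosen even so that no remainder loop
1      is needed.
1 
```c
enum { ROWS = 4, COLS = 3 };

/* Original loop nest: one inner loop per outer iteration.  */
void row_sums(int a[ROWS][COLS], int out[ROWS])
{
    for (int i = 0; i < ROWS; i++) {
        out[i] = 0;
        for (int j = 0; j < COLS; j++)
            out[i] += a[i][j];
    }
}

/* Outer loop unrolled by 2, and the two resulting copies of the
   inner loop fused ("jammed") into one.  */
void row_sums_jammed(int a[ROWS][COLS], int out[ROWS])
{
    for (int i = 0; i < ROWS; i += 2) {
        out[i] = 0;
        out[i + 1] = 0;
        for (int j = 0; j < COLS; j++) {
            out[i] += a[i][j];
            out[i + 1] += a[i + 1][j];
        }
    }
}
```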
1 
1 '-ftree-loop-im'
1      Perform loop invariant motion on trees.  This pass moves only
1      invariants that are hard to handle at RTL level (function calls,
1      operations that expand to nontrivial sequences of insns).  With
1      '-funswitch-loops' it also moves operands of conditions that are
1      invariant out of the loop, so that we can use just trivial
1      invariantness analysis in loop unswitching.  The pass also includes
1      store motion.
1 
1 '-ftree-loop-ivcanon'
1      Create a canonical counter for the number of iterations in loops
1      for which determining the number of iterations requires complicated
1      analysis.  Later optimizations may then determine the number
1      easily.  This is especially useful in connection with unrolling.
1 
1 '-fivopts'
1      Perform induction variable optimizations (strength reduction,
1      induction variable merging and induction variable elimination) on
1      trees.
1 
1 '-ftree-parallelize-loops=n'
1      Parallelize loops, i.e., split their iteration space to run in n
1      threads.  This is only possible for loops whose iterations are
1      independent and can be arbitrarily reordered.  The optimization is
1      only profitable on multiprocessor machines, for loops that are
1      CPU-intensive, rather than constrained e.g. by memory bandwidth.
1      This option implies '-pthread', and thus is only supported on
1      targets that have support for '-pthread'.
1 
1 '-ftree-pta'
1      Perform function-local points-to analysis on trees.  This flag is
1      enabled by default at '-O' and higher.
1 
1 '-ftree-sra'
1      Perform scalar replacement of aggregates.  This pass replaces
1      structure references with scalars to prevent committing structures
1      to memory too early.  This flag is enabled by default at '-O' and
1      higher.
1 
1 '-fstore-merging'
1      Perform merging of narrow stores to consecutive memory addresses.
1      This pass merges contiguous stores of immediate values narrower
1      than a word into fewer wider stores to reduce the number of
1      instructions.  This is enabled by default at '-O2' and higher as
1      well as '-Os'.
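1 
1      A sketch of code that is a candidate for this pass.  The structure
1      and function are hypothetical; whether the four byte stores end up
1      as a single wider store depends on the target and GCC version.
1 
```c
struct header {
    unsigned char version;
    unsigned char kind;
    unsigned char flags;
    unsigned char reserved;
};

/* Four narrow stores of immediate values to consecutive addresses;
   store merging may combine them into one 32-bit store.  */
void header_init(struct header *h)
{
    h->version = 1;
    h->kind = 2;
    h->flags = 0;
    h->reserved = 0;
}
```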
1 
1 '-ftree-ter'
1      Perform temporary expression replacement during the SSA->normal
1      phase.  Single use/single def temporaries are replaced at their use
1      location with their defining expression.  This results in
1      non-GIMPLE code, but gives the expanders much more complex trees to
1      work on resulting in better RTL generation.  This is enabled by
1      default at '-O' and higher.
1 
1 '-ftree-slsr'
1      Perform straight-line strength reduction on trees.  This recognizes
1      related expressions involving multiplications and replaces them by
1      less expensive calculations when possible.  This is enabled by
1      default at '-O' and higher.
1 
1 '-ftree-vectorize'
1      Perform vectorization on trees.  This flag enables
1      '-ftree-loop-vectorize' and '-ftree-slp-vectorize' if not
1      explicitly specified.
1 
1 '-ftree-loop-vectorize'
1      Perform loop vectorization on trees.  This flag is enabled by
1      default at '-O3' and when '-ftree-vectorize' is enabled.
1 
1 '-ftree-slp-vectorize'
1      Perform basic block vectorization on trees.  This flag is enabled
1      by default at '-O3' and when '-ftree-vectorize' is enabled.
1 
1 '-fvect-cost-model=MODEL'
1      Alter the cost model used for vectorization.  The MODEL argument
1      should be one of 'unlimited', 'dynamic' or 'cheap'.  With the
1      'unlimited' model the vectorized code-path is assumed to be
1      profitable while with the 'dynamic' model a runtime check guards
1      the vectorized code-path to enable it only for iteration counts
1      that will likely execute faster than when executing the original
1      scalar loop.  The 'cheap' model disables vectorization of loops
1      where doing so would be cost prohibitive for example due to
1      required runtime checks for data dependence or alignment but
1      otherwise is equal to the 'dynamic' model.  The default cost model
1      depends on other optimization flags and is either 'dynamic' or
1      'cheap'.
1 
1 '-fsimd-cost-model=MODEL'
1      Alter the cost model used for vectorization of loops marked with
1      the OpenMP simd directive.  The MODEL argument should be one of
1      'unlimited', 'dynamic', 'cheap'.  All values of MODEL have the same
1      meaning as described in '-fvect-cost-model' and by default a cost
1      model defined with '-fvect-cost-model' is used.
1 
1 '-ftree-vrp'
1      Perform Value Range Propagation on trees.  This is similar to the
1      constant propagation pass, but instead of values, ranges of values
1      are propagated.  This allows the optimizers to remove unnecessary
1      range checks like array bound checks and null pointer checks.  This
1      is enabled by default at '-O2' and higher.  Null pointer check
1      elimination is only done if '-fdelete-null-pointer-checks' is
1      enabled.
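1 
1      The sketch below shows a range check that becomes unnecessary once
1      the range of a value is known.  The function name and the table
1      size of 16 are assumptions made for the example.
1 
```c
int table_lookup(const int table[16], unsigned int index)
{
    unsigned int i = index & 15u;   /* i is known to lie in [0, 15] */

    if (i >= 16u)                   /* always false given that range, */
        return -1;                  /* so the check can be removed    */

    return table[i];
}
```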
1 
1 '-fsplit-paths'
1      Split paths leading to loop backedges.  This can improve dead code
1      elimination and common subexpression elimination.  This is enabled
1      by default at '-O2' and above.
1 
1 '-fsplit-ivs-in-unroller'
1      Enables expression of values of induction variables in later
1      iterations of the unrolled loop using the value in the first
1      iteration.  This breaks long dependency chains, thus improving
1      efficiency of the scheduling passes.
1 
1      A combination of '-fweb' and CSE is often sufficient to obtain the
1      same effect.  However, that is not reliable in cases where the loop
1      body is more complicated than a single basic block.  It also does
1      not work at all on some architectures due to restrictions in the
1      CSE pass.
1 
1      This optimization is enabled by default.
1 
1 '-fvariable-expansion-in-unroller'
1      With this option, the compiler creates multiple copies of some
1      local variables when unrolling a loop, which can result in superior
1      code.
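1 
1      One way to picture this is an accumulator that is split into
1      several copies when the loop is unrolled.  The following is a
1      hand-written sketch with hypothetical names and an unroll factor
1      of 2; it assumes the sums do not overflow.
1 
```c
/* Single accumulator: every addition depends on the previous one.  */
int sum_single(const int *v, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by 2 with the accumulator expanded into two copies, so
   the two additions in the body are independent of each other.  */
int sum_expanded(const int *v, int n)
{
    int s0 = 0, s1 = 0;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        s0 += v[i];
    return s0 + s1;     /* combine the copies after the loop */
}
```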
1 
1 '-fpartial-inlining'
1      Inline parts of functions.  This option has an effect only when
1      inlining itself is turned on by the '-finline-functions' or
1      '-finline-small-functions' options.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fpredictive-commoning'
1      Perform predictive commoning optimization, i.e., reusing
1      computations (especially memory loads and stores) performed in
1      previous iterations of loops.
1 
1      This option is enabled at level '-O3'.
1 
1 '-fprefetch-loop-arrays'
1      If supported by the target machine, generate instructions to
1      prefetch memory to improve the performance of loops that access
1      large arrays.
1 
1      This option may generate better or worse code; results are highly
1      dependent on the structure of loops within the source code.
1 
1      Disabled at level '-Os'.
1 
1 '-fno-printf-return-value'
1      Do not substitute constants for the known return values of formatted
1      output functions such as 'sprintf', 'snprintf', 'vsprintf', and
1      'vsnprintf' (but not 'printf' or 'fprintf').  This transformation
1      allows GCC to optimize or even eliminate branches based on the
1      known return value of these functions called with arguments that
1      are either constant, or whose values are known to be in a range
1      that makes determining the exact return value possible.  For
1      example, when '-fprintf-return-value' is in effect, both the branch
1      and the body of the 'if' statement (but not the call to 'snprintf')
1      can be optimized away when 'i' is a 32-bit or smaller integer
1      because the return value is guaranteed to be at most 8.
1 
1           char buf[9];
1           if (snprintf (buf, sizeof buf, "%08x", i) >= sizeof buf)
1             ...
1 
1      The '-fprintf-return-value' option relies on other optimizations
1      and yields best results with '-O2' and above.  It works in tandem
1      with the '-Wformat-overflow' and '-Wformat-truncation' options.
1      The '-fprintf-return-value' option is enabled by default.
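1 
1      A slightly fuller sketch of the same pattern follows.  The helper
1      name 'format_hex' is hypothetical, and whether the truncation
1      branch is actually removed depends on the GCC version and the
1      optimization level in use.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format a value as 8 hexadecimal digits.
   Returns the number of characters written (excluding the NUL),
   or -1 on error or truncation.  */
static int
format_hex (char *buf, size_t size, unsigned int i)
{
  assert (buf != NULL && size > 0);
  int n = snprintf (buf, size, "%08x", i);
  /* With a 32-bit 'unsigned int' the output is exactly 8 characters,
     so for size >= 9 this branch can never be taken and the compiler
     may remove it when -fprintf-return-value is in effect.  */
  if (n < 0 || (size_t) n >= size)
    return -1;
  return n;
}
```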
1 
1 '-fno-peephole'
1 '-fno-peephole2'
1      Disable any machine-specific peephole optimizations.  The
1      difference between '-fno-peephole' and '-fno-peephole2' is in how
1      they are implemented in the compiler; some targets use one, some
1      use the other, a few use both.
1 
1      '-fpeephole' is enabled by default.  '-fpeephole2' is enabled at
1      levels '-O2', '-O3', '-Os'.
1 
1 '-fno-guess-branch-probability'
1      Do not guess branch probabilities using heuristics.
1 
1      GCC uses heuristics to guess branch probabilities if they are not
1      provided by profiling feedback ('-fprofile-arcs').  These
1      heuristics are based on the control flow graph.  If some branch
1      probabilities are specified by '__builtin_expect', then the
1      heuristics are used to guess branch probabilities for the rest of
1      the control flow graph, taking the '__builtin_expect' info into
1      account.  The interactions between the heuristics and
1      '__builtin_expect' can be complex, and in some cases, it may be
1      useful to disable the heuristics so that the effects of
1      '__builtin_expect' are easier to understand.
1 
1      The default is '-fguess-branch-probability' at levels '-O', '-O2',
1      '-O3', '-Os'.
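1 
1      A minimal sketch of a '__builtin_expect' hint follows.  The macro
1      and function names are assumptions made for illustration; only
1      '__builtin_expect' itself is provided by GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Conventional wrappers around the GCC built-in; the macro names are
   an assumption of this sketch, not something GCC defines.  */
#define LIKELY(x)   __builtin_expect (!!(x), 1)
#define UNLIKELY(x) __builtin_expect (!!(x), 0)

/* Divide num by den, storing the quotient in *out.
   Returns 0 on success and -1 if den is zero.  */
static int
checked_div (unsigned int num, unsigned int den, unsigned int *out)
{
  assert (out != NULL);
  if (UNLIKELY (den == 0))
    return -1;               /* explicitly hinted as the rare path */
  *out = num / den;          /* unhinted code: probabilities here are
                                guessed unless the heuristics are off */
  return 0;
}
```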
1 
1 '-freorder-blocks'
1      Reorder basic blocks in the compiled function in order to reduce the
1      number of taken branches and improve code locality.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-freorder-blocks-algorithm=ALGORITHM'
1      Use the specified algorithm for basic block reordering.  The
1      ALGORITHM argument can be 'simple', which does not increase code
1      size (except sometimes due to secondary effects like alignment), or
1      'stc', the "software trace cache" algorithm, which tries to put all
1      often executed code together, minimizing the number of branches
1      executed by making extra copies of code.
1 
1      The default is 'simple' at levels '-O', '-Os', and 'stc' at levels
1      '-O2', '-O3'.
1 
1 '-freorder-blocks-and-partition'
1      In addition to reordering basic blocks in the compiled function in
1      order to reduce the number of taken branches, partition hot and cold
1      basic blocks into separate sections of the assembly and '.o' files,
1      to improve paging and cache locality performance.
1 
1      This optimization is automatically turned off in the presence of
1      exception handling or unwind tables (on targets using
1      setjmp/longjmp or a target-specific scheme), for linkonce sections,
1      for functions with a user-defined section attribute, and on any
1      architecture that does not support named sections.  When
1      '-fsplit-stack' is used this option is not enabled by default (to
1      avoid linker errors), but may be enabled explicitly (if using a
1      working linker).
1 
1      Enabled for x86 at levels '-O2', '-O3', '-Os'.
1 
1 '-freorder-functions'
1      Reorder functions in the object file in order to improve code
1      locality.  This is implemented by using special subsections
1      '.text.hot' for the most frequently executed functions and
1      '.text.unlikely' for functions that are unlikely to be executed.
1      Reordering is done by the linker, so the object file format must
1      support named sections and the linker must place them in a
1      reasonable way.
1 
1      Profile feedback must also be available for this option to be
1      effective.  See '-fprofile-arcs' for details.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-fstrict-aliasing'
1      Allow the compiler to assume the strictest aliasing rules
1      applicable to the language being compiled.  For C (and C++), this
1      activates optimizations based on the type of expressions.  In
1      particular, an object of one type is assumed never to reside at the
1      same address as an object of a different type, unless the types are
1      almost the same.  For example, an 'unsigned int' can alias an
1      'int', but not a 'void*' or a 'double'.  A character type may alias
1      any other type.
1 
1      Pay special attention to code like this:
1           union a_union {
1             int i;
1             double d;
1           };
1 
1           int f() {
1             union a_union t;
1             t.d = 3.0;
1             return t.i;
1           }
1      The practice of reading from a different union member than the one
1      most recently written to (called "type-punning") is common.  Even
1      with '-fstrict-aliasing', type-punning is allowed, provided the
1      memory is accessed through the union type.  So, the code above
1      works as expected.  ⇒Structures unions enumerations and
1      bit-fields implementation.  However, this code might not:
1           int f() {
1             union a_union t;
1             int* ip;
1             t.d = 3.0;
1             ip = &t.i;
1             return *ip;
1           }
1 
1      Similarly, access by taking the address, casting the resulting
1      pointer and dereferencing the result has undefined behavior, even
1      if the cast uses a union type, e.g.:
1           int f() {
1             double d = 3.0;
1             return ((union a_union *) &d)->i;
1           }
1 
1      The '-fstrict-aliasing' option is enabled at levels '-O2', '-O3',
1      '-Os'.
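1 
1      Besides the union approach shown above, a commonly used
1      alternative (not described in the text above) is to copy the bytes
1      with 'memcpy', which accesses the objects as characters and so
1      does not violate the aliasing rules.  The sketch below assumes a
1      32-bit 'float'; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the object representation of a float as a 32-bit integer.
   Assumes 'float' is 32 bits wide, which is checked at run time.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  assert (sizeof u == sizeof f);
  memcpy (&u, &f, sizeof u);   /* byte-wise copy: no aliasing violation */
  return u;
}
```

1      On a target that uses the IEEE 754 single-precision format,
1      'float_bits (1.0f)' yields '0x3f800000'.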
1 
1 '-falign-functions'
1 '-falign-functions=N'
1      Align the start of functions to the next power-of-two greater than
1      or equal to N, skipping up to N-1 bytes.  For instance,
1      '-falign-functions=32' aligns functions to the next 32-byte
1      boundary, but '-falign-functions=24' aligns to the next 32-byte
1      boundary only if this can be done by skipping 23 bytes or less.
1 
1      '-fno-align-functions' and '-falign-functions=1' are equivalent and
1      mean that functions are not aligned.
1 
1      Some assemblers only support this flag when N is a power of two; in
1      that case, it is rounded up.
1 
1      If N is not specified or is zero, use a machine-dependent default.
1      The maximum allowed N option value is 65536.
1 
1      Enabled at levels '-O2', '-O3'.
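1 
1      The rule can be modelled in a few lines of C.  This is only a
1      sketch of the behavior described by the '-falign-functions=24'
1      example above, not GCC's or the assembler's implementation; the
1      function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power of two greater than or equal to n.  */
static size_t
next_pow2 (size_t n)
{
  assert (n >= 1 && n <= 65536);   /* 65536 is the documented maximum */
  size_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

/* Model of -falign-functions=n: return the number of padding bytes
   placed before a function that would otherwise start at 'addr', or 0
   if reaching the boundary would need more than n - 1 bytes.  */
static size_t
align_padding (size_t addr, size_t n)
{
  size_t boundary = next_pow2 (n);
  size_t skip = (boundary - addr % boundary) % boundary;
  return skip < n ? skip : 0;
}
```

1      For a function that would start at address 0x1001, this model
1      pads by 31 bytes with N=32 but leaves the function unaligned with
1      N=24, because 31 exceeds the 23 bytes that may be skipped.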
1 
1 '-flimit-function-alignment'
1      If this option is enabled, the compiler tries to avoid
1      unnecessarily overaligning functions.  It attempts to instruct the
1      assembler to align by the amount specified by '-falign-functions',
1      but not to skip more bytes than the size of the function.
1 
1 '-falign-labels'
1 '-falign-labels=N'
1      Align all branch targets to a power-of-two boundary, skipping up to
1      N bytes like '-falign-functions'.  This option can easily make code
1      slower, because it must insert dummy operations for when the branch
1      target is reached in the usual flow of the code.
1 
1      '-fno-align-labels' and '-falign-labels=1' are equivalent and mean
1      that labels are not aligned.
1 
1      If '-falign-loops' or '-falign-jumps' are applicable and are
1      greater than this value, then their values are used instead.
1 
1      If N is not specified or is zero, use a machine-dependent default
1      which is very likely to be '1', meaning no alignment.  The maximum
1      allowed N option value is 65536.
1 
1      Enabled at levels '-O2', '-O3'.
1 
1 '-falign-loops'
1 '-falign-loops=N'
1      Align loops to a power-of-two boundary, skipping up to N bytes like
1      '-falign-functions'.  If the loops are executed many times, this
1      makes up for any execution of the dummy operations.
1 
1      '-fno-align-loops' and '-falign-loops=1' are equivalent and mean
1      that loops are not aligned.  The maximum allowed N option value is
1      65536.
1 
1      If N is not specified or is zero, use a machine-dependent default.
1 
1      Enabled at levels '-O2', '-O3'.
1 
1 '-falign-jumps'
1 '-falign-jumps=N'
1      Align branch targets to a power-of-two boundary, for branch targets
1      where the targets can only be reached by jumping, skipping up to N
1      bytes like '-falign-functions'.  In this case, no dummy operations
1      need be executed.
1 
1      '-fno-align-jumps' and '-falign-jumps=1' are equivalent and mean
1      that jump targets are not aligned.
1 
1      If N is not specified or is zero, use a machine-dependent default.
1      The maximum allowed N option value is 65536.
1 
1      Enabled at levels '-O2', '-O3'.
1 
1 '-funit-at-a-time'
1      This option is left for compatibility reasons.  '-funit-at-a-time'
1      has no effect, while '-fno-unit-at-a-time' implies
1      '-fno-toplevel-reorder' and '-fno-section-anchors'.
1 
1      Enabled by default.
1 
1 '-fno-toplevel-reorder'
1      Do not reorder top-level functions, variables, and 'asm'
1      statements.  Output them in the same order that they appear in the
1      input file.  When this option is used, unreferenced static
1      variables are not removed.  This option is intended to support
1      existing code that relies on a particular ordering.  For new code,
1      it is better to use attributes when possible.
1 
1      Enabled at level '-O0'.  When disabled explicitly, it also implies
1      '-fno-section-anchors', which is otherwise enabled at '-O0' on some
1      targets.
1 
1 '-fweb'
1      Construct webs as commonly used for register allocation purposes
1      and assign each web an individual pseudo register.  This allows the
1      register allocation pass to operate on pseudos directly, but also
1      strengthens several other optimization passes, such as CSE, the loop
1      optimizer and the trivial dead code remover.  It can, however, make
1      debugging impossible, since variables no longer stay in a "home
1      register".
1 
1      Enabled by default with '-funroll-loops'.
1 
1 '-fwhole-program'
1      Assume that the current compilation unit represents the whole
1      program being compiled.  All public functions and variables with
1      the exception of 'main' and those marked by attribute
1      'externally_visible' become static functions and in effect are
1      optimized more aggressively by interprocedural optimizers.
1 
1      This option should not be used in combination with '-flto'.
1      Instead relying on a linker plugin should provide safer and more
1      precise information.
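1 
1      As a small sketch of how a symbol can be kept visible, the code
1      below applies the 'externally_visible' attribute to one function
1      and leaves another unmarked.  Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Not marked: under -fwhole-program this is treated as if it had been
   declared 'static'.  */
size_t
name_length (const char *name)
{
  size_t n = 0;
  while (name[n] != '\0')
    n++;
  return n;
}

/* Marked: remains visible to code outside the current compilation
   unit even when -fwhole-program is in effect.  */
__attribute__ ((externally_visible))
int
plugin_entry (const char *name)
{
  assert (name != NULL);
  return name_length (name) > 0;
}
```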
1 
1 '-flto[=N]'
1      This option runs the standard link-time optimizer.  When invoked
1      with source code, it generates GIMPLE (one of GCC's internal
1      representations) and writes it to special ELF sections in the
1      object file.  When the object files are linked together, all the
1      function bodies are read from these ELF sections and instantiated
1      as if they had been part of the same translation unit.
1 
1      To use the link-time optimizer, '-flto' and optimization options
1      should be specified at compile time and during the final link.  It
1      is recommended that you compile all the files participating in the
1      same link with the same options and also specify those options at
1      link time.  For example:
1 
1           gcc -c -O2 -flto foo.c
1           gcc -c -O2 -flto bar.c
1           gcc -o myprog -flto -O2 foo.o bar.o
1 
1      The first two invocations to GCC save a bytecode representation of
1      GIMPLE into special ELF sections inside 'foo.o' and 'bar.o'.  The
1      final invocation reads the GIMPLE bytecode from 'foo.o' and
1      'bar.o', merges the two files into a single internal image, and
1      compiles the result as usual.  Since both 'foo.o' and 'bar.o' are
1      merged into a single image, this causes all the interprocedural
1      analyses and optimizations in GCC to work across the two files as
1      if they were a single one.  This means, for example, that the
1      inliner is able to inline functions in 'bar.o' into functions in
1      'foo.o' and vice-versa.
1 
1      Another (simpler) way to enable link-time optimization is:
1 
1           gcc -o myprog -flto -O2 foo.c bar.c
1 
1      The above generates bytecode for 'foo.c' and 'bar.c', merges them
1      together into a single GIMPLE representation and optimizes them as
1      usual to produce 'myprog'.
1 
1      The only important thing to keep in mind is that to enable
1      link-time optimizations you need to use the GCC driver to perform
1      the link step.  GCC then automatically performs link-time
1      optimization if any of the objects involved were compiled with the
1      '-flto' command-line option.  You generally should specify the
1      optimization options to be used for link-time optimization though
1      GCC tries to be clever at guessing an optimization level to use
1      from the options used at compile time if you fail to specify one at
1      link time.  You can always override the automatic decision to do
1      link-time optimization by passing '-fno-lto' to the link command.
1 
1      To make whole program optimization effective, it is necessary to
1      make certain whole program assumptions.  The compiler needs to know
1      what functions and variables can be accessed by libraries and
1      runtime outside of the link-time optimized unit.  When supported by
1      the linker, the linker plugin (see '-fuse-linker-plugin') passes
1      information to the compiler about used and externally visible
1      symbols.  When the linker plugin is not available,
1      '-fwhole-program' should be used to allow the compiler to make
1      these assumptions, which leads to more aggressive optimization
1      decisions.
1 
1      When '-fuse-linker-plugin' is not enabled and a file is compiled
1      with '-flto', the generated object file is larger than a regular
1      object file because it contains GIMPLE bytecodes and the usual
1      final code (see '-ffat-lto-objects').  This means that object files
1      with LTO information can be linked as normal object files; if
1      '-fno-lto' is passed to the linker, no interprocedural
1      optimizations are applied.  Note that when '-fno-fat-lto-objects'
1      is enabled the compile stage is faster but you cannot perform a
1      regular, non-LTO link on them.
1 
1      Additionally, the optimization flags used to compile individual
1      files are not necessarily related to those used at link time.  For
1      instance,
1 
1           gcc -c -O0 -ffat-lto-objects -flto foo.c
1           gcc -c -O0 -ffat-lto-objects -flto bar.c
1           gcc -o myprog -O3 foo.o bar.o
1 
1      This produces individual object files with unoptimized assembler
1      code, but the resulting binary 'myprog' is optimized at '-O3'.  If,
1      instead, the final binary is generated with '-fno-lto', then
1      'myprog' is not optimized.
1 
1      When producing the final binary, GCC only applies link-time
1      optimizations to those files that contain bytecode.  Therefore, you
1      can mix and match object files and libraries with GIMPLE bytecodes
1      and final object code.  GCC automatically selects which files to
1      optimize in LTO mode and which files to link without further
1      processing.
1 
1      There are some code generation flags preserved by GCC when
1      generating bytecodes, as they need to be used during the final link
1      stage.  Generally options specified at link time override those
1      specified at compile time.
1 
1      If you do not specify an optimization level option '-O' at link
1      time, then GCC uses the highest optimization level used when
1      compiling the object files.
1 
1      Currently, the following options and their settings are taken from
1      the first object file that explicitly specifies them: '-fPIC',
1      '-fpic', '-fpie', '-fcommon', '-fexceptions',
1      '-fnon-call-exceptions', '-fgnu-tm' and all the '-m' target flags.
1 
1      Certain ABI-changing flags are required to match in all compilation
1      units, and trying to override this at link time with a conflicting
1      value is ignored.  This includes options such as
1      '-freg-struct-return' and '-fpcc-struct-return'.
1 
1      Other options such as '-ffp-contract', '-fno-strict-overflow',
1      '-fwrapv', '-fno-trapv' or '-fno-strict-aliasing' are passed
1      through to the link stage and merged conservatively for conflicting
1      translation units.  Specifically '-fno-strict-overflow', '-fwrapv'
1      and '-fno-trapv' take precedence; and for example
1      '-ffp-contract=off' takes precedence over '-ffp-contract=fast'.
1      You can override them at link time.
1 
1      If LTO encounters objects with C linkage declared with incompatible
1      types in separate translation units to be linked together
1      (undefined behavior according to ISO C99 6.2.7), a non-fatal
1      diagnostic may be issued.  The behavior is still undefined at run
1      time.  Similar diagnostics may be raised for other languages.
1 
1      Another feature of LTO is that it is possible to apply
1      interprocedural optimizations on files written in different
1      languages:
1 
1           gcc -c -flto foo.c
1           g++ -c -flto bar.cc
1           gfortran -c -flto baz.f90
1           g++ -o myprog -flto -O3 foo.o bar.o baz.o -lgfortran
1 
1      Notice that the final link is done with 'g++' to get the C++
1      runtime libraries and '-lgfortran' is added to get the Fortran
1      runtime libraries.  In general, when mixing languages in LTO mode,
1      you should use the same link command options as when mixing
1      languages in a regular (non-LTO) compilation.
1 
1      If object files containing GIMPLE bytecode are stored in a library
1      archive, say 'libfoo.a', it is possible to extract and use them in
1      an LTO link if you are using a linker with plugin support.  To
1      create static libraries suitable for LTO, use 'gcc-ar' and
1      'gcc-ranlib' instead of 'ar' and 'ranlib'; to show the symbols of
1      object files with GIMPLE bytecode, use 'gcc-nm'.  Those commands
1      require that 'ar', 'ranlib' and 'nm' have been compiled with plugin
1      support.  At link time, use the flag '-fuse-linker-plugin' to
1      ensure that the library participates in the LTO optimization
1      process:
1 
1           gcc -o myprog -O2 -flto -fuse-linker-plugin a.o b.o -lfoo
1 
1      With the linker plugin enabled, the linker extracts the needed
1      GIMPLE files from 'libfoo.a' and passes them on to the running GCC
1      to make them part of the aggregated GIMPLE image to be optimized.
1 
1      If you are not using a linker with plugin support and/or do not
1      enable the linker plugin, then the objects inside 'libfoo.a' are
1      extracted and linked as usual, but they do not participate in the
1      LTO optimization process.  In order to make a static library
1      suitable for both LTO optimization and usual linkage, compile its
1      object files with '-flto' '-ffat-lto-objects'.
1 
1      Link-time optimizations do not require the presence of the whole
1      program to operate.  If the program does not require any symbols to
1      be exported, it is possible to combine '-flto' and
1      '-fwhole-program' to allow the interprocedural optimizers to use
1      more aggressive assumptions which may lead to improved optimization
1      opportunities.  Use of '-fwhole-program' is not needed when the
1      linker plugin is active (see '-fuse-linker-plugin').
1 
1      The current implementation of LTO makes no attempt to generate
1      bytecode that is portable between different types of hosts.  The
1      bytecode files are versioned and there is a strict version check,
1      so bytecode files generated in one version of GCC do not work with
1      an older or newer version of GCC.
1 
1      Link-time optimization does not work well with generation of
1      debugging information on systems other than those using a
1      combination of ELF and DWARF.
1 
1      If you specify the optional N, the optimization and code generation
1      done at link time is executed in parallel using N parallel jobs by
1      utilizing an installed 'make' program.  The environment variable
1      'MAKE' may be used to override the program used.  The default value
1      for N is 1.
1 
1      You can also specify '-flto=jobserver' to use GNU make's job server
1      mode to determine the number of parallel jobs.  This is useful when
1      the Makefile calling GCC is already executing in parallel.  You
1      must prepend a '+' to the command recipe in the parent Makefile for
1      this to work.  This option likely only works if 'MAKE' is GNU make.
1 
1 '-flto-partition=ALG'
1      Specify the partitioning algorithm used by the link-time optimizer.
1      The value is either '1to1' to specify a partitioning mirroring the
1      original source files or 'balanced' to specify partitioning into
1      equally sized chunks (whenever possible) or 'max' to create a new
1      partition for every symbol where possible.  Specifying 'none' as an
1      algorithm disables partitioning and streaming completely.  The
1      default value is 'balanced'.  While '1to1' can be used as a
1      workaround for various code ordering issues, the 'max' partitioning
1      is intended for internal testing only.  The value 'one' specifies
1      that exactly one partition should be used while the value 'none'
1      bypasses partitioning and executes the link-time optimization step
1      directly from the WPA phase.
1 
1 '-flto-odr-type-merging'
1      Enable streaming of mangled type names of C++ types and their
1      unification at link time.  This increases the size of LTO object
1      files, but enables diagnostics about One Definition Rule violations.
1 
1 '-flto-compression-level=N'
1      This option specifies the level of compression used for
1      intermediate language written to LTO object files, and is only
1      meaningful in conjunction with LTO mode ('-flto').  Valid values
1      are 0 (no compression) to 9 (maximum compression).  Values outside
1      this range are clamped to either 0 or 9.  If the option is not
1      given, a default balanced compression setting is used.
1 
1 '-fuse-linker-plugin'
1      Enables the use of a linker plugin during link-time optimization.
1      This option relies on plugin support in the linker, which is
1      available in gold or in GNU ld 2.21 or newer.
1 
1      This option enables the extraction of object files with GIMPLE
1      bytecode out of library archives.  This improves the quality of
1      optimization by exposing more code to the link-time optimizer.
1      This information specifies what symbols can be accessed externally
1      (by non-LTO object or during dynamic linking).  Resulting code
1      quality improvements on binaries (and shared libraries that use
1      hidden visibility) are similar to '-fwhole-program'.  See '-flto'
1      for a description of the effect of this flag and how to use it.
1 
1      This option is enabled by default when LTO support in GCC is
1      enabled and GCC was configured for use with a linker supporting
1      plugins (GNU ld 2.21 or newer or gold).
1 
1 '-ffat-lto-objects'
1      Fat LTO objects are object files that contain both the intermediate
1      language and the object code.  This makes them usable for both LTO
1      linking and normal linking.  This option is effective only when
1      compiling with '-flto' and is ignored at link time.
1 
1      '-fno-fat-lto-objects' improves compilation time over plain LTO,
1      but requires the complete toolchain to be aware of LTO. It requires
1      a linker with linker plugin support for basic functionality.
1      Additionally, 'nm', 'ar' and 'ranlib' need to support linker
1      plugins to allow a full-featured build environment (capable of
1      building static libraries etc).  GCC provides the 'gcc-ar',
1      'gcc-nm', 'gcc-ranlib' wrappers to pass the right options to these
1      tools.  With non-fat LTO, makefiles need to be modified to use them.
1 
1      Note that modern binutils provide a plugin auto-load mechanism.
1      Installing the linker plugin into '$libdir/bfd-plugins' has the
1      same effect as usage of the command wrappers ('gcc-ar', 'gcc-nm'
1      and 'gcc-ranlib').
1 
1      The default is '-fno-fat-lto-objects' on targets with linker plugin
1      support.
1 
1 '-fcompare-elim'
1      After register allocation and post-register allocation instruction
1      splitting, identify arithmetic instructions that compute processor
1      flags similar to a comparison operation based on that arithmetic.
1      If possible, eliminate the explicit comparison operation.
1 
1      This pass only applies to certain targets that cannot explicitly
1      represent the comparison operation before register allocation is
1      complete.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-fcprop-registers'
1      After register allocation and post-register allocation instruction
1      splitting, perform a copy-propagation pass to try to reduce
1      scheduling dependencies and occasionally eliminate the copy.
1 
1      Enabled at levels '-O', '-O2', '-O3', '-Os'.
1 
1 '-fprofile-correction'
1      Profiles collected using an instrumented binary for multi-threaded
1      programs may be inconsistent due to missed counter updates.  When
1      this option is specified, GCC uses heuristics to correct or smooth
1      out such inconsistencies.  By default, GCC emits an error message
1      when an inconsistent profile is detected.
1 
1 '-fprofile-use'
1 '-fprofile-use=PATH'
1      Enable profile feedback-directed optimizations, and the following
1      optimizations which are generally profitable only with profile
1      feedback available: '-fbranch-probabilities', '-fvpt',
1      '-funroll-loops', '-fpeel-loops', '-ftracer', '-ftree-vectorize',
1      and '-ftree-loop-distribute-patterns'.
1 
1      Before you can use this option, you must first generate profiling
1      information.  ⇒Instrumentation Options, for information
1      about the '-fprofile-generate' option.
1 
1      By default, GCC emits an error message if the feedback profiles do
1      not match the source code.  This error can be turned into a warning
1      by using '-Wcoverage-mismatch'.  Note this may result in poorly
1      optimized code.
1 
1      If PATH is specified, GCC looks at the PATH to find the profile
1      feedback data files.  See '-fprofile-dir'.
1 
1 '-fauto-profile'
1 '-fauto-profile=PATH'
1      Enable sampling-based feedback-directed optimizations, and the
1      following optimizations which are generally profitable only with
1      profile feedback available: '-fbranch-probabilities', '-fvpt',
1      '-funroll-loops', '-fpeel-loops', '-ftracer', '-ftree-vectorize',
1      '-finline-functions', '-fipa-cp', '-fipa-cp-clone',
1      '-fpredictive-commoning', '-funswitch-loops',
1      '-fgcse-after-reload', and '-ftree-loop-distribute-patterns'.
1 
1      PATH is the name of a file containing AutoFDO profile information.
1      If omitted, it defaults to 'fbdata.afdo' in the current directory.
1 
1      Producing an AutoFDO profile data file requires running your
1      program with the 'perf' utility on a supported GNU/Linux target
1      system.  For more information, see <https://perf.wiki.kernel.org/>.
1 
1      E.g.
1           perf record -e br_inst_retired:near_taken -b -o perf.data \
1               -- your_program
1 
1      Then use the 'create_gcov' tool to convert the raw profile data to
1      a format that can be used by GCC.  You must also supply the
1      unstripped binary for your program to this tool.  See
1      <https://github.com/google/autofdo>.
1 
1      E.g.
1           create_gcov --binary=your_program.unstripped --profile=perf.data \
1               --gcov=profile.afdo
1 
1  The following options control compiler behavior regarding
1 floating-point arithmetic.  These options trade off between speed and
1 correctness.  All must be specifically enabled.
1 
1 '-ffloat-store'
1      Do not store floating-point variables in registers, and inhibit
1      other options that might change whether a floating-point value is
1      taken from a register or memory.
1 
1      This option prevents undesirable excess precision on machines such
1      as the 68000 where the floating registers (of the 68881) keep more
1      precision than a 'double' is supposed to have.  Similarly for the
1      x86 architecture.  For most programs, the excess precision does
1      only good, but a few programs rely on the precise definition of
1      IEEE floating point.  Use '-ffloat-store' for such programs, after
1      modifying them to store all pertinent intermediate computations
1      into variables.
1 
1 '-fexcess-precision=STYLE'
1      This option allows further control over excess precision on
1      machines where floating-point operations occur in a format with
1      more precision or range than the IEEE standard and interchange
1      floating-point types.  By default, '-fexcess-precision=fast' is in
1      effect; this means that operations may be carried out in a wider
1      precision than the types specified in the source if that would
1      result in faster code, and it is unpredictable when rounding to the
1      types specified in the source code takes place.  When compiling C,
1      if '-fexcess-precision=standard' is specified then excess precision
1      follows the rules specified in ISO C99; in particular, both casts
1      and assignments cause values to be rounded to their semantic types
1      (whereas '-ffloat-store' only affects assignments).  This option is
1      enabled by default for C if a strict conformance option such as
1      '-std=c99' is used.  '-ffast-math' enables
1      '-fexcess-precision=fast' by default regardless of whether a strict
1      conformance option is used.
1 
1      '-fexcess-precision=standard' is not implemented for languages
1      other than C. On the x86, it has no effect if '-mfpmath=sse' or
1      '-mfpmath=sse+387' is specified; in the former case, IEEE semantics
1      apply without excess precision, and in the latter, rounding is
1      unpredictable.
1 
1 '-ffast-math'
1      Sets the options '-fno-math-errno', '-funsafe-math-optimizations',
1      '-ffinite-math-only', '-fno-rounding-math', '-fno-signaling-nans',
1      '-fcx-limited-range' and '-fexcess-precision=fast'.
1 
1      This option causes the preprocessor macro '__FAST_MATH__' to be
1      defined.
1 
1      This option is not turned on by any '-O' option besides '-Ofast'
1      since it can result in incorrect output for programs that depend on
1      an exact implementation of IEEE or ISO rules/specifications for
1      math functions.  It may, however, yield faster code for programs
1      that do not require the guarantees of these specifications.
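1 
1      A program can test the '__FAST_MATH__' macro to detect this mode.
1      The sketch below is one possible use, with hypothetical function
1      names: a start-up check for code whose results depend on IEEE
1      semantics.

```c
#include <assert.h>

/* 1 when this translation unit was compiled with __FAST_MATH__
   defined, 0 otherwise.  */
static int
fast_math_enabled (void)
{
#ifdef __FAST_MATH__
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical start-up check for programs whose numerical results
   depend on IEEE rules: stop early if fast math is in effect.  */
static void
require_ieee_math (void)
{
  assert (!fast_math_enabled () && "built with -ffast-math");
}
```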
1 
1 '-fno-math-errno'
1      Do not set 'errno' after calling math functions that are executed
1      with a single instruction, e.g., 'sqrt'.  A program that relies on
1      IEEE exceptions for math error handling may want to use this flag
1      for speed while maintaining IEEE arithmetic compatibility.
1 
1      This option is not turned on by any '-O' option since it can result
1      in incorrect output for programs that depend on an exact
1      implementation of IEEE or ISO rules/specifications for math
1      functions.  It may, however, yield faster code for programs that do
1      not require the guarantees of these specifications.
1 
1      The default is '-fmath-errno'.
1 
1      On Darwin systems, the math library never sets 'errno'.  There is
1      therefore no reason for the compiler to consider the possibility
1      that it might, and '-fno-math-errno' is the default.
1 
1 '-funsafe-math-optimizations'
1 
1      Allow optimizations for floating-point arithmetic that (a) assume
1      that arguments and results are valid and (b) may violate IEEE or
1      ANSI standards.  When used at link time, it may include libraries
1      or startup files that change the default FPU control word or other
1      similar optimizations.
1 
1      This option is not turned on by any '-O' option since it can result
1      in incorrect output for programs that depend on an exact
1      implementation of IEEE or ISO rules/specifications for math
1      functions.  It may, however, yield faster code for programs that do
1      not require the guarantees of these specifications.  Enables
1      '-fno-signed-zeros', '-fno-trapping-math', '-fassociative-math' and
1      '-freciprocal-math'.
1 
1      The default is '-fno-unsafe-math-optimizations'.
1 
1 '-fassociative-math'
1 
1      Allow re-association of operands in series of floating-point
1      operations.  This violates the ISO C and C++ language standards by
1      possibly changing the computation result.  NOTE: re-ordering may
1      change the sign of zero as well as ignore NaNs and inhibit or create
1      underflow or overflow (and thus cannot be used on code that relies
1      on rounding behavior like '(x + 2**52) - 2**52').  May also reorder
1      floating-point comparisons and thus may not be used when ordered
1      comparisons are required.  This option requires that both
1      '-fno-signed-zeros' and '-fno-trapping-math' be in effect.
1      Moreover, it doesn't make much sense with '-frounding-math'.  For
1      Fortran the option is automatically enabled when both
1      '-fno-signed-zeros' and '-fno-trapping-math' are in effect.
1 
1      The default is '-fno-associative-math'.
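
     The following sketch illustrates why re-association is not value
     preserving.  It assumes IEEE double precision and the default
     round-to-nearest mode; the function names are hypothetical.  The
     'volatile' temporaries only force each intermediate result to be
     rounded to 'double' before it is used again.

```c
/* Sums three values left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Sums the same values right to left: a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}

/* Rounds a small non-negative value to an integer with the
   (x + 2**52) - 2**52 idiom.  This only works if the addition is
   rounded before the subtraction is carried out. */
double round_magic(double x)
{
    const double two52 = 4503599627370496.0;   /* 2**52 */
    volatile double t = x + two52;
    return t - two52;
}
```

     With 'a = 1e16', 'b = -1e16' and 'c = 1.0', 'sum_left' yields 1.0
     while 'sum_right' yields 0.0, because '-1e16 + 1.0' is rounded back
     to '-1e16'.  A compiler that is allowed to re-associate may also
     simplify 'round_magic' to return 'x' unchanged.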
1 
1 '-freciprocal-math'
1 
1      Allow the reciprocal of a value to be used instead of dividing by
1      the value if this enables optimizations.  For example 'x / y' can
1      be replaced with 'x * (1/y)', which is useful if '(1/y)' is subject
1      to common subexpression elimination.  Note that this loses
1      precision and increases the number of flops operating on the value.
1 
1      The default is '-fno-reciprocal-math'.
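
     As an illustration, the two hypothetical functions below spell out
     both forms by hand.  This is only a sketch of the transformation,
     not the code GCC generates.

```c
/* Direct division: a single, correctly rounded operation. */
double div_direct(double x, double y)
{
    return x / y;
}

/* Reciprocal form permitted by -freciprocal-math: two rounded
   operations, so the result may differ from x / y in the last bit. */
double div_recip(double x, double y)
{
    double r = 1.0 / y;   /* candidate for common subexpression elimination */
    return x * r;
}
```

     For example, 'div_direct (49.0, 49.0)' is exactly 1.0, whereas the
     reciprocal form is only guaranteed to be close to 1.0.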
1 
1 '-ffinite-math-only'
1      Allow optimizations for floating-point arithmetic that assume that
1      arguments and results are not NaNs or +-Infs.
1 
1      This option is not turned on by any '-O' option since it can result
1      in incorrect output for programs that depend on an exact
1      implementation of IEEE or ISO rules/specifications for math
1      functions.  It may, however, yield faster code for programs that do
1      not require the guarantees of these specifications.
1 
1      The default is '-fno-finite-math-only'.
1 
1 '-fno-signed-zeros'
1      Allow optimizations for floating-point arithmetic that ignore the
1      signedness of zero.  IEEE arithmetic specifies the behavior of
1      distinct +0.0 and -0.0 values, which then prohibits simplification
1      of expressions such as x+0.0 or 0.0*x (even with
1      '-ffinite-math-only').  This option implies that the sign of a zero
1      result isn't significant.
1 
1      The default is '-fsigned-zeros'.
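
     The following sketch shows why 'x+0.0' cannot be simplified to 'x'
     while signed zeros are honored.  It assumes IEEE arithmetic in the
     default round-to-nearest mode; the helper names are hypothetical.

```c
#include <math.h>

/* Returns nonzero if x is a zero whose sign bit is set (-0.0). */
int is_negative_zero(double x)
{
    return x == 0.0 && signbit(x);
}

/* Not an identity under IEEE rules: -0.0 + 0.0 yields +0.0. */
double add_zero(double x)
{
    return x + 0.0;
}
```

     With '-fsigned-zeros' in effect, 'add_zero (-0.0)' returns a
     positive zero, so replacing the addition by 'x' would change the
     sign of the result.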
1 
1 '-fno-trapping-math'
1      Compile code assuming that floating-point operations cannot
1      generate user-visible traps.  These traps include division by zero,
1      overflow, underflow, inexact result and invalid operation.  This
1      option requires that '-fno-signaling-nans' be in effect.  Setting
1      this option may allow faster code if one relies on "non-stop" IEEE
1      arithmetic, for example.
1 
1      This option should never be turned on by any '-O' option since it
1      can result in incorrect output for programs that depend on an exact
1      implementation of IEEE or ISO rules/specifications for math
1      functions.
1 
1      The default is '-ftrapping-math'.
1 
1 '-frounding-math'
1      Disable transformations and optimizations that assume default
1      floating-point rounding behavior.  This is round-to-zero for all
1      floating point to integer conversions, and round-to-nearest for all
1      other arithmetic truncations.  This option should be specified for
1      programs that change the FP rounding mode dynamically, or that may
1      be executed with a non-default rounding mode.  This option disables
1      constant folding of floating-point expressions at compile time
1      (which may be affected by rounding mode) and arithmetic
1      transformations that are unsafe in the presence of sign-dependent
1      rounding modes.
1 
1      The default is '-fno-rounding-math'.
1 
1      This option is experimental and does not currently guarantee to
1      disable all GCC optimizations that are affected by rounding mode.
1      Future versions of GCC may provide finer control of this setting
1      using C99's 'FENV_ACCESS' pragma.  This command-line option will be
1      used to specify the default state for 'FENV_ACCESS'.
1 
1 '-fsignaling-nans'
1      Compile code assuming that IEEE signaling NaNs may generate
1      user-visible traps during floating-point operations.  Setting this
1      option disables optimizations that may change the number of
1      exceptions visible with signaling NaNs.  This option implies
1      '-ftrapping-math'.
1 
1      This option causes the preprocessor macro '__SUPPORT_SNAN__' to be
1      defined.
1 
1      The default is '-fno-signaling-nans'.
1 
1      This option is experimental and does not currently guarantee to
1      disable all GCC optimizations that affect signaling NaN behavior.
1 
1 '-fno-fp-int-builtin-inexact'
1      Do not allow the built-in functions 'ceil', 'floor', 'round' and
1      'trunc', and their 'float' and 'long double' variants, to generate
1      code that raises the "inexact" floating-point exception for
1      noninteger arguments.  ISO C99 and C11 allow these functions to
1      raise the "inexact" exception, but ISO/IEC TS 18661-1:2014, the C
1      bindings to IEEE 754-2008, does not allow these functions to do so.
1 
1      The default is '-ffp-int-builtin-inexact', allowing the exception
1      to be raised.  This option does nothing unless '-ftrapping-math' is
1      in effect.
1 
1      Even if '-fno-fp-int-builtin-inexact' is used, if the functions
1      generate a call to a library function then the "inexact" exception
1      may be raised if the library implementation does not follow TS
1      18661.
1 
1 '-fsingle-precision-constant'
1      Treat floating-point constants as single precision instead of
1      implicitly converting them to double-precision constants.
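
     The effect on a constant can be sketched with an explicit 'f'
     suffix, which requests the single-precision value by hand.  The
     function names are hypothetical.

```c
/* 0.1 as an ordinary double-precision constant. */
double tenth_double(void)
{
    return 0.1;
}

/* 0.1 rounded to single precision first and then widened; this is the
   value an unsuffixed 0.1 takes under -fsingle-precision-constant. */
double tenth_single(void)
{
    return (double)0.1f;
}
```

     The two results differ by roughly 1.5e-9, which may matter in code
     that accumulates many such constants.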
1 
1 '-fcx-limited-range'
1      When enabled, this option states that a range reduction step is not
1      needed when performing complex division.  Also, there is no
1      checking whether the result of a complex multiplication or division
1      is 'NaN + I*NaN', with an attempt to rescue the situation in that
1      case.  The default is '-fno-cx-limited-range', but the option is
1      enabled by '-ffast-math'.
1 
1      This option controls the default setting of the ISO C99
1      'CX_LIMITED_RANGE' pragma.  Nevertheless, the option applies to all
1      languages.
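
     To illustrate what the range reduction step protects against, the
     sketch below compares the textbook division formula with a
     Smith-style scaled division.  The 'struct cplx' type and the
     function names are hypothetical, and the scaled version is one
     common form of range reduction; it is not claimed to match the
     routine GCC actually calls.

```c
struct cplx { double re, im; };

/* Absolute value without depending on the math library. */
static double mag(double v)
{
    return v < 0.0 ? -v : v;
}

/* Textbook formula with no range reduction.  The term c*c + d*d can
   overflow even when the quotient itself is representable. */
struct cplx div_naive(struct cplx x, struct cplx y)
{
    double den = y.re * y.re + y.im * y.im;
    struct cplx q = { (x.re * y.re + x.im * y.im) / den,
                      (x.im * y.re - x.re * y.im) / den };
    return q;
}

/* Smith-style division: scale by the ratio of the smaller to the
   larger component of the divisor so intermediates stay in range. */
struct cplx div_scaled(struct cplx x, struct cplx y)
{
    struct cplx q;
    if (mag(y.re) >= mag(y.im)) {
        double r = y.im / y.re;
        double den = y.re + y.im * r;
        q.re = (x.re + x.im * r) / den;
        q.im = (x.im - x.re * r) / den;
    } else {
        double r = y.re / y.im;
        double den = y.re * r + y.im;
        q.re = (x.re * r + x.im) / den;
        q.im = (x.im * r - x.re) / den;
    }
    return q;
}
```

     Dividing '1e200 + 1e200*I' by itself gives NaN components with
     'div_naive', because the intermediate products overflow, while
     'div_scaled' returns 1.0 with a zero imaginary part.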
1 
1 '-fcx-fortran-rules'
1      Complex multiplication and division follow Fortran rules.  Range
1      reduction is done as part of complex division, but there is no
1      checking whether the result of a complex multiplication or division
1      is 'NaN + I*NaN', with an attempt to rescue the situation in that
1      case.
1 
1      The default is '-fno-cx-fortran-rules'.
1 
1  The following options control optimizations that may improve
1 performance, but are not enabled by any '-O' options.  This section
1 includes experimental options that may produce broken code.
1 
1 '-fbranch-probabilities'
1      After running a program compiled with '-fprofile-arcs' (⇒
1      Instrumentation Options), you can compile it a second time using
1      '-fbranch-probabilities', to improve optimizations based on the
1      number of times each branch was taken.  When a program compiled
1      with '-fprofile-arcs' exits, it saves arc execution counts to a
1      file called 'SOURCENAME.gcda' for each source file.  The
1      information in this data file is very dependent on the structure of
1      the generated code, so you must use the same source code and the
1      same optimization options for both compilations.
1 
1      With '-fbranch-probabilities', GCC puts a 'REG_BR_PROB' note on
1      each 'JUMP_INSN' and 'CALL_INSN'.  These can be used to improve
1      optimization.  Currently, they are only used in one place: in
1      'reorg.c', instead of guessing which path a branch is most likely
1      to take, the 'REG_BR_PROB' values are used to exactly determine
1      which path is taken more often.
1 
1 '-fprofile-values'
1      If combined with '-fprofile-arcs', it adds code so that some data
1      about values of expressions in the program is gathered.
1 
1      With '-fbranch-probabilities', it reads back the data gathered from
1      profiling values of expressions for usage in optimizations.
1 
1      Enabled with '-fprofile-generate' and '-fprofile-use'.
1 
1 '-fprofile-reorder-functions'
1      Function reordering based on profile instrumentation collects the
1      first time of execution of each function and orders the functions
1      in ascending order of that time.
1 
1      Enabled with '-fprofile-use'.
1 
1 '-fvpt'
1      If combined with '-fprofile-arcs', this option instructs the
1      compiler to add code to gather information about values of
1      expressions.
1 
1      With '-fbranch-probabilities', it reads back the data gathered and
1      actually performs the optimizations based on them.  Currently the
1      optimizations include specialization of division operations using
1      the knowledge about the value of the denominator.
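
     The kind of specialization meant here can be sketched by hand as
     follows.  The divisor value 16 stands in for whatever value the
     profile data might show to be dominant; it and the function names
     are assumptions made for this illustration.

```c
/* Generic division: the divisor is only known at run time. */
unsigned divide_generic(unsigned x, unsigned d)
{
    return x / d;
}

/* Specialized form: test for the assumed common divisor first, so the
   frequent case becomes a cheap shift. */
unsigned divide_specialized(unsigned x, unsigned d)
{
    if (d == 16u)
        return x >> 4;   /* same result as x / 16 for unsigned x */
    return x / d;        /* fallback for every other divisor */
}
```

     Both functions return the same value for every nonzero divisor;
     only the cost of the common case changes.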
1 
1 '-frename-registers'
1      Attempt to avoid false dependencies in scheduled code by making use
1      of registers left over after register allocation.  This
1      optimization most benefits processors with lots of registers.
1      Depending on the debug information format adopted by the target,
1      however, it can make debugging impossible, since variables no
1      longer stay in a "home register".
1 
1      Enabled by default with '-funroll-loops'.
1 
1 '-fschedule-fusion'
1      Performs a target-dependent pass over the instruction stream to
1      schedule instructions of the same type together, because the
1      target machine can execute them more efficiently if they are
1      adjacent to each other in the instruction flow.
1 
1      Enabled at levels '-O2', '-O3', '-Os'.
1 
1 '-ftracer'
1      Perform tail duplication to enlarge superblock size.  This
1      transformation simplifies the control flow of the function allowing
1      other optimizations to do a better job.
1 
1      Enabled with '-fprofile-use'.
1 
1 '-funroll-loops'
1      Unroll loops whose number of iterations can be determined at
1      compile time or upon entry to the loop.  '-funroll-loops' implies
1      '-frerun-cse-after-loop', '-fweb' and '-frename-registers'.  It
1      also turns on complete loop peeling (i.e. complete removal of loops
1      with a small constant number of iterations).  This option makes
1      code larger, and may or may not make it run faster.
1 
1      Enabled with '-fprofile-use'.
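
     As a rough picture of the transformation, the sketch below unrolls
     a summation loop by hand.  The unroll factor of four and the
     function names are assumptions for this example; GCC chooses its
     own factor.

```c
#include <stddef.h>

/* Straightforward loop: one addition and one branch per element. */
long sum_rolled(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* The same loop unrolled four times, followed by a remainder loop
   for the last n % 4 elements. */
long sum_unrolled4(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)
        s += v[i];
    return s;
}
```

     The unrolled version executes fewer branches but occupies more
     code space, which is the trade-off described above.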
1 
1 '-funroll-all-loops'
1      Unroll all loops, even if their number of iterations is uncertain
1      when the loop is entered.  This usually makes programs run more
1      slowly.  '-funroll-all-loops' implies the same options as
1      '-funroll-loops'.
1 
1 '-fpeel-loops'
1      Peels loops for which there is enough information that they do not
1      roll much (from profile feedback or static analysis).  It also
1      turns on complete loop peeling (i.e. complete removal of loops with
1      small constant number of iterations).
1 
1      Enabled with '-O3' and/or '-fprofile-use'.
1 
1 '-fmove-loop-invariants'
1      Enables the loop invariant motion pass in the RTL loop optimizer.
1      Enabled at level '-O1'.
1 
1 '-fsplit-loops'
1      Split a loop into two if it contains a condition that's always true
1      for one side of the iteration space and false for the other.
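
     A hand-written sketch of loop splitting follows.  The function
     names and the weighting by two are made up for the example.

```c
/* Original loop: the test i < m is true for the first part of the
   iteration space and false for the rest. */
long weigh_original(const int *v, int n, int m)
{
    long s = 0;
    for (int i = 0; i < n; i++) {
        if (i < m)
            s += v[i];
        else
            s += 2L * v[i];
    }
    return s;
}

/* Split form: two loops, neither of which tests i < m in its body. */
long weigh_split(const int *v, int n, int m)
{
    int mid = m < 0 ? 0 : (m < n ? m : n);   /* clamp split point to [0, n] */
    long s = 0;
    int i = 0;
    for (; i < mid; i++)
        s += v[i];
    for (; i < n; i++)
        s += 2L * v[i];
    return s;
}
```

     Both functions compute the same sum; the split form removes the
     conditional from each loop body.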
1 
1 '-funswitch-loops'
1      Move branches with loop invariant conditions out of the loop, with
1      duplicates of the loop on both branches (modified according to
1      result of the condition).
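
     The transformation can be sketched by hand as shown below; the
     function names are hypothetical.

```c
/* Original loop: the condition on 'negate' does not change between
   iterations, yet it is tested every time. */
long accumulate_original(const int *v, int n, int negate)
{
    long s = 0;
    for (int i = 0; i < n; i++) {
        if (negate)
            s -= v[i];
        else
            s += v[i];
    }
    return s;
}

/* Unswitched form: the invariant test is done once, and each branch
   carries its own copy of the loop. */
long accumulate_unswitched(const int *v, int n, int negate)
{
    long s = 0;
    if (negate) {
        for (int i = 0; i < n; i++)
            s -= v[i];
    } else {
        for (int i = 0; i < n; i++)
            s += v[i];
    }
    return s;
}
```

     The duplicated loop bodies are why this optimization trades code
     size for fewer branches inside the loop.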
1 
1 '-ffunction-sections'
1 '-fdata-sections'
1      Place each function or data item into its own section in the output
1      file if the target supports arbitrary sections.  The name of the
1      function or the name of the data item determines the section's name
1      in the output file.
1 
1      Use these options on systems where the linker can perform
1      optimizations to improve locality of reference in the instruction
1      space.  Most systems using the ELF object format have linkers with
1      such optimizations.  On AIX, the linker rearranges sections
1      (CSECTs) based on the call graph.  The performance impact varies.
1 
1      Together with linker garbage collection (linker '--gc-sections'
1      option) these options may lead to smaller statically-linked
1      executables (after stripping).
1 
1      On ELF/DWARF systems these options do not degrade the quality of
1      the debug information.  There could be issues with other object
1      files/debug info formats.
1 
1      Only use these options when there are significant benefits from
1      doing so.  When you specify these options, the assembler and linker
1      create larger object and executable files and are also slower.
1      These options affect code generation.  They prevent optimizations
1      by the compiler and assembler using relative locations inside a
1      translation unit since the locations are unknown until link time.
1      An example of such an optimization is relaxing calls to short call
1      instructions.
1 
1 '-fbranch-target-load-optimize'
1      Perform branch target register load optimization before prologue /
1      epilogue threading.  The use of target registers can typically be
1      exposed only during reload, thus hoisting loads out of loops and
1      doing inter-block scheduling needs a separate optimization pass.
1 
1 '-fbranch-target-load-optimize2'
1      Perform branch target register load optimization after prologue /
1      epilogue threading.
1 
1 '-fbtr-bb-exclusive'
1      When performing branch target register load optimization, don't
1      reuse branch target registers within any basic block.
1 
1 '-fstdarg-opt'
1      Optimize the prologue of variadic argument functions with respect
1      to usage of those arguments.
1 
1 '-fsection-anchors'
1      Try to reduce the number of symbolic address calculations by using
1      shared "anchor" symbols to address nearby objects.  This
1      transformation can help to reduce the number of GOT entries and GOT
1      accesses on some targets.
1 
1      For example, the implementation of the following function 'foo':
1 
1           static int a, b, c;
1           int foo (void) { return a + b + c; }
1 
1      usually calculates the addresses of all three variables, but if you
1      compile it with '-fsection-anchors', it accesses the variables from
1      a common anchor point instead.  The effect is similar to the
1      following pseudocode (which isn't valid C):
1 
1           int foo (void)
1           {
1             register int *xr = &x;
1             return xr[&a - &x] + xr[&b - &x] + xr[&c - &x];
1           }
1 
1      Not all targets support this option.
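
     A loosely analogous effect can be written in valid C by grouping
     the variables in one object, as in the sketch below.  The
     'struct anchored' type, the object 'vars' and its initial values
     are hypothetical; the compiler's anchors work on sections, not on
     source-level structures.

```c
/* Grouping the variables in one object makes the shared base address
   explicit; members are then reached by small constant offsets. */
struct anchored { int a, b, c; };

static struct anchored vars = { 1, 2, 3 };

int foo(void)
{
    const struct anchored *base = &vars;   /* one address calculation */
    return base->a + base->b + base->c;    /* three offset loads */
}
```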
1 
1 '--param NAME=VALUE'
1      In some places, GCC uses various constants to control the amount of
1      optimization that is done.  For example, GCC does not inline
1      functions that contain more than a certain number of instructions.
1      You can control some of these constants on the command line using
1      the '--param' option.
1 
1      The names of specific parameters, and the meaning of the values,
1      are tied to the internals of the compiler, and are subject to
1      change without notice in future releases.
1 
1      In each case, the VALUE is an integer.  The allowable choices for
1      NAME are:
1 
1      'predictable-branch-outcome'
1           When a branch is predicted to be taken with probability lower
1           than this threshold (in percent), it is considered well
1           predictable.  The default is 10.
1 
1      'max-rtl-if-conversion-insns'
1           RTL if-conversion tries to remove conditional branches around
1           a block and replace them with conditionally executed
1           instructions.  This parameter gives the maximum number of
1           instructions in a block which should be considered for
1           if-conversion.  The default is 10, though the compiler will
1           also use other heuristics to decide whether if-conversion is
1           likely to be profitable.
1 
1      'max-rtl-if-conversion-predictable-cost'
1      'max-rtl-if-conversion-unpredictable-cost'
1           RTL if-conversion will try to remove conditional branches
1           around a block and replace them with conditionally executed
1           instructions.  These parameters give the maximum permissible
1           cost for the sequence that would be generated by if-conversion
1           depending on whether the branch is statically determined to be
1           predictable or not.  The units for this parameter are the same
1           as those for the GCC internal seq_cost metric.  The compiler
1           will try to provide a reasonable default for this parameter
1           using the BRANCH_COST target macro.
1 
1      'max-crossjump-edges'
1           The maximum number of incoming edges to consider for
1           cross-jumping.  The algorithm used by '-fcrossjumping' is
1           O(N^2) in the number of edges incoming to each block.
1           Increasing values mean more aggressive optimization, making
1           the compilation time increase with probably small improvement
1           in executable size.
1 
1      'min-crossjump-insns'
1           The minimum number of instructions that must be matched at the
1           end of two blocks before cross-jumping is performed on them.
1           This value is ignored in the case where all instructions in
1           the block being cross-jumped from are matched.  The default
1           value is 5.
1 
1      'max-grow-copy-bb-insns'
1           The maximum code size expansion factor when copying basic
1           blocks instead of jumping.  The expansion is relative to a
1           jump instruction.  The default value is 8.
1 
1      'max-goto-duplication-insns'
1           The maximum number of instructions to duplicate to a block
1           that jumps to a computed goto.  To avoid O(N^2) behavior in a
1           number of passes, GCC factors computed gotos early in the
1           compilation process, and unfactors them as late as possible.
1           Only computed jumps at the end of a basic block with no more
1           than max-goto-duplication-insns instructions are unfactored.
1           The default value is 8.
1 
1      'max-delay-slot-insn-search'
1           The maximum number of instructions to consider when looking
1           for an instruction to fill a delay slot.  If more than this
1           arbitrary number of instructions are searched, the time
1           savings from filling the delay slot are minimal, so stop
1           searching.  Increasing values mean more aggressive
1           optimization, making the compilation time increase with
1           probably small improvement in execution time.
1 
1      'max-delay-slot-live-search'
1           When trying to fill delay slots, the maximum number of
1           instructions to consider when searching for a block with valid
1           live register information.  Increasing this arbitrarily chosen
1           value means more aggressive optimization, increasing the
1           compilation time.  This parameter should be removed when the
1           delay slot code is rewritten to maintain the control-flow
1           graph.
1 
1      'max-gcse-memory'
1           The approximate maximum amount of memory that can be allocated
1           in order to perform the global common subexpression
1           elimination optimization.  If more memory than specified is
1           required, the optimization is not done.
1 
1      'max-gcse-insertion-ratio'
1           If the ratio of expression insertions to deletions is larger
1           than this value for any expression, then RTL PRE does not
1           insert or remove the expression and thus leaves partially
1           redundant computations in the instruction stream.  The default
1           value is 20.
1 
1      'max-pending-list-length'
1           The maximum number of pending dependencies scheduling allows
1           before flushing the current state and starting over.  Large
1           functions with few branches or calls can create excessively
1           large lists which needlessly consume memory and resources.
1 
1      'max-modulo-backtrack-attempts'
1           The maximum number of backtrack attempts the scheduler should
1           make when modulo scheduling a loop.  Larger values can
1           exponentially increase compilation time.
1 
1      'max-inline-insns-single'
1           Several parameters control the tree inliner used in GCC.  This
1           number sets the maximum number of instructions (counted in
1           GCC's internal representation) in a single function that the
1           tree inliner considers for inlining.  This only affects
1           functions declared inline and methods implemented in a class
1           declaration (C++).  The default value is 400.
1 
1      'max-inline-insns-auto'
1           When you use '-finline-functions' (included in '-O3'), a lot
1           of functions that would otherwise not be considered for
1           inlining by the compiler are investigated.  To those
1           functions, a different (more restrictive) limit compared to
1           functions declared inline can be applied.  The default value
1           is 30.
1 
1      'inline-min-speedup'
1           When the estimated performance improvement of caller +
1           callee runtime exceeds this threshold (in percent), the
1           function can be inlined regardless of the limit on '--param
1           max-inline-insns-single' and '--param max-inline-insns-auto'.
1           The default value is 15.
1 
1      'large-function-insns'
1           The limit specifying really large functions.  For functions
1           larger than this limit after inlining, inlining is constrained
1           by '--param large-function-growth'.  This parameter is useful
1           primarily to avoid extreme compilation time caused by
1           non-linear algorithms used by the back end.  The default value
1           is 2700.
1 
1      'large-function-growth'
1           Specifies the maximal growth of a large function caused by
1           inlining, in percent.  The default value is 100, which limits
1           large function growth to 2.0 times the original size.
1 
1      'large-unit-insns'
1           The limit specifying a large translation unit.  Growth caused
1           by inlining of units larger than this limit is limited by
1           '--param inline-unit-growth'.  For small units this might be
1           too tight.  For example, consider a unit consisting of
1           function A that is inline and B that just calls A three times.
1           If B is small relative to A, the growth of the unit is 300%
1           and yet such inlining is very sane.  For very large units
1           consisting of small inlineable functions, however, the overall
1           unit growth limit is needed to avoid exponential explosion of
1           code size.  Thus for smaller units, the size is increased to
1           '--param large-unit-insns' before applying '--param
1           inline-unit-growth'.  The default is 10000.
1 
1      'inline-unit-growth'
1           Specifies maximal overall growth of the compilation unit
1           caused by inlining.  The default value is 20 which limits unit
1           growth to 1.2 times the original size.  Cold functions (either
1           marked cold via an attribute or by profile feedback) are not
1           accounted into the unit size.
1 
1      'ipcp-unit-growth'
1           Specifies maximal overall growth of the compilation unit
1           caused by interprocedural constant propagation.  The default
1           value is 10 which limits unit growth to 1.1 times the original
1           size.
1 
1      'large-stack-frame'
1           The limit specifying large stack frames.  While inlining, the
1           algorithm tries not to grow past this limit too much.  The
1           default value is 256 bytes.
1 
1      'large-stack-frame-growth'
1           Specifies the maximal growth of large stack frames caused by
1           inlining, in percent.  The default value is 1000, which limits
1           large stack frame growth to 11 times the original size.
1 
1      'max-inline-insns-recursive'
1      'max-inline-insns-recursive-auto'
1           Specifies the maximum number of instructions an out-of-line
1           copy of a self-recursive inline function can grow into by
1           performing recursive inlining.
1 
1           '--param max-inline-insns-recursive' applies to functions
1           declared inline.  For functions not declared inline, recursive
1           inlining happens only when '-finline-functions' (included in
1           '-O3') is enabled; '--param max-inline-insns-recursive-auto'
1           applies instead.  The default value is 450.
1 
1      'max-inline-recursive-depth'
1      'max-inline-recursive-depth-auto'
1           Specifies the maximum recursion depth used for recursive
1           inlining.
1 
1           '--param max-inline-recursive-depth' applies to functions
1           declared inline.  For functions not declared inline, recursive
1           inlining happens only when '-finline-functions' (included in
1           '-O3') is enabled; '--param max-inline-recursive-depth-auto'
1           applies instead.  The default value is 8.
1 
1      'min-inline-recursive-probability'
1           Recursive inlining is profitable only for functions with deep
1           recursion on average, and can hurt functions with little
1           recursion depth by increasing the prologue size or the
1           complexity of the function body seen by other optimizers.
1 
1           When profile feedback is available (see '-fprofile-generate')
1           the actual recursion depth can be guessed from the probability
1           that the function recurses via a given call expression.  This
1           parameter limits inlining only to call expressions whose
1           probability exceeds the given threshold (in percent).  The
1           default value is 10.
1 
1      'early-inlining-insns'
1           Specify growth that the early inliner can make.  In effect it
1           increases the amount of inlining for code having a large
1           abstraction penalty.  The default value is 14.
1 
1      'max-early-inliner-iterations'
1           Limit of iterations of the early inliner.  This basically
1           bounds the number of nested indirect calls the early inliner
1           can resolve.  Deeper chains are still handled by late
1           inlining.
1 
1      'comdat-sharing-probability'
1           Probability (in percent) that C++ inline functions with comdat
1           visibility are shared across multiple compilation units.  The
1           default value is 20.
1 
1      'profile-func-internal-id'
1           A parameter to control whether to use function internal id in
1           profile database lookup.  If the value is 0, the compiler uses
1           an id that is based on function assembler name and filename,
1           which makes old profile data more tolerant to source changes
1           such as function reordering etc.  The default value is 0.
1 
1      'min-vect-loop-bound'
1           The minimum number of iterations under which loops are not
1           vectorized when '-ftree-vectorize' is used.  The number of
1           iterations after vectorization needs to be greater than the
1           value specified by this option to allow vectorization.  The
1           default value is 0.
1 
1      'gcse-cost-distance-ratio'
1           Scaling factor in calculation of maximum distance an
1           expression can be moved by GCSE optimizations.  This is
1           currently supported only in the code hoisting pass.  The
1           bigger the ratio, the more aggressive code hoisting is with
1           simple expressions, i.e., the expressions that have cost less
1           than 'gcse-unrestricted-cost'.  Specifying 0 disables hoisting
1           of simple expressions.  The default value is 10.
1 
1      'gcse-unrestricted-cost'
1           Cost, roughly measured as the cost of a single typical machine
1           instruction, at which GCSE optimizations do not constrain the
1           distance an expression can travel.  This is currently
1           supported only in the code hoisting pass.  The lower the
1           cost, the more aggressive code hoisting is.  Specifying 0
1           allows all expressions to travel unrestricted distances.  The
1           default value is 3.
1 
1      'max-hoist-depth'
1           The depth of search in the dominator tree for expressions to
1           hoist.  This is used to avoid quadratic behavior in the
1           hoisting algorithm.  A value of 0 does not limit the search,
1           but may slow down compilation of huge functions.  The default
1           value is 30.
1 
1      'max-tail-merge-comparisons'
1           The maximum number of similar basic blocks to compare a basic
1           block with.  This is used to avoid quadratic behavior in tree
1           tail merging.  The default value is 10.
1 
1      'max-tail-merge-iterations'
1           The maximum number of iterations of the pass over the
1           function.  This is used to limit compilation time in tree tail
1           merging.  The default value is 2.
1 
1      'store-merging-allow-unaligned'
1           Allow the store merging pass to introduce unaligned stores if
1           it is legal to do so.  The default value is 1.
1 
1      'max-stores-to-merge'
1           The maximum number of stores to attempt to merge into wider
1           stores in the store merging pass.  The minimum value is 2 and
1           the default is 64.
1 
1      'max-unrolled-insns'
1           The maximum number of instructions that a loop may have to be
1           unrolled.  If a loop is unrolled, this parameter also
1           determines how many times the loop code is unrolled.
1 
1      'max-average-unrolled-insns'
1           The maximum number of instructions biased by probabilities of
1           their execution that a loop may have to be unrolled.  If a
1           loop is unrolled, this parameter also determines how many
1           times the loop code is unrolled.
1 
1      'max-unroll-times'
1           The maximum number of unrollings of a single loop.
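1 
1           As a rough illustration of the transformation these limits
1           govern, the following C sketch unrolls a summation loop by
1           hand.  The unroll factor of 4 and the function names are
1           assumptions chosen for the example; the code GCC actually
1           generates depends on the target and on the parameters above.
1 
```c
#include <stddef.h>

/* Reference loop: one addition per iteration. */
long sum_plain(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by a hypothetical factor of 4, with a remainder loop. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)   /* leftover 0..3 elements */
        s += a[i];
    return s;
}
```
1 
1           The unrolled body is roughly four times the size of the
1           original one, which is the kind of growth the instruction
1           count limits are meant to bound.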
1 
1      'max-peeled-insns'
1           The maximum number of instructions that a loop may have to be
1           peeled.  If a loop is peeled, this parameter also determines
1           how many times the loop code is peeled.
1 
1      'max-peel-times'
1           The maximum number of peelings of a single loop.
1 
1      'max-peel-branches'
1           The maximum number of branches on the hot path through the
1           peeled sequence.
1 
1      'max-completely-peeled-insns'
1           The maximum number of insns of a completely peeled loop.
1 
1      'max-completely-peel-times'
1           The maximum number of iterations of a loop to be suitable for
1           complete peeling.
1 
1      'max-completely-peel-loop-nest-depth'
1           The maximum depth of a loop nest suitable for complete
1           peeling.
1 
1      'max-unswitch-insns'
1           The maximum number of insns of an unswitched loop.
1 
1      'max-unswitch-level'
1           The maximum number of branches unswitched in a single loop.
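1 
1           Loop unswitching can be pictured with the following C sketch,
1           written by hand for illustration.  The function names are
1           hypothetical, and whether GCC unswitches a given loop depends
1           on the two limits above and on other heuristics.
1 
```c
#include <stddef.h>

/* Original form: the loop-invariant flag is tested on every iteration. */
void scale_or_negate(int *a, size_t n, int negate)
{
    for (size_t i = 0; i < n; i++) {
        if (negate)
            a[i] = -a[i];
        else
            a[i] = 2 * a[i];
    }
}

/* Unswitched by hand: the test is hoisted, one loop copy per outcome. */
void scale_or_negate_unswitched(int *a, size_t n, int negate)
{
    if (negate) {
        for (size_t i = 0; i < n; i++)
            a[i] = -a[i];
    } else {
        for (size_t i = 0; i < n; i++)
            a[i] = 2 * a[i];
    }
}
```
1 
1           Each unswitched branch duplicates the loop body, so several
1           invariant branches in one loop can multiply the code size.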
1 
1      'max-loop-headers-insns'
1           The maximum number of insns in loop header duplicated by the
1           copy loop headers pass.
1 
1      'lim-expensive'
1           The minimum cost of an expensive expression in the loop
1           invariant motion.
1 
1      'iv-consider-all-candidates-bound'
1           Bound on number of candidates for induction variables, below
1           which all candidates are considered for each use in induction
1           variable optimizations.  If there are more candidates than
1           this, only the most relevant ones are considered to avoid
1           quadratic time complexity.
1 
1      'iv-max-considered-uses'
1           The induction variable optimizations give up on loops that
1           contain more induction variable uses.
1 
1      'iv-always-prune-cand-set-bound'
1           If the number of candidates in the set is smaller than this
1           value, always try to remove unnecessary ivs from the set when
1           adding a new one.
1 
1      'avg-loop-niter'
1           Average number of iterations of a loop.
1 
1      'dse-max-object-size'
1           Maximum size (in bytes) of objects tracked bytewise by dead
1           store elimination.  Larger values may result in larger
1           compilation times.
1 
1      'scev-max-expr-size'
1           Bound on size of expressions used in the scalar evolutions
1           analyzer.  Large expressions slow the analyzer.
1 
1      'scev-max-expr-complexity'
1           Bound on the complexity of the expressions in the scalar
1           evolutions analyzer.  Complex expressions slow the analyzer.
1 
1      'max-tree-if-conversion-phi-args'
1           Maximum number of arguments in a PHI supported by TREE if
1           conversion unless the loop is marked with simd pragma.
1 
1      'vect-max-version-for-alignment-checks'
1           The maximum number of run-time checks that can be performed
1           when doing loop versioning for alignment in the vectorizer.
1 
1      'vect-max-version-for-alias-checks'
1           The maximum number of run-time checks that can be performed
1           when doing loop versioning for alias in the vectorizer.
1 
1      'vect-max-peeling-for-alignment'
1           The maximum number of loop peels to enhance access alignment
1           for vectorizer.  Value -1 means no limit.
1 
1      'max-iterations-to-track'
1           The maximum number of iterations of a loop the brute-force
1           algorithm for analysis of the number of iterations of the loop
1           tries to evaluate.
1 
1      'hot-bb-count-ws-permille'
1           A basic block profile count is considered hot if it
1           contributes to the given permillage (i.e.  0...1000) of the
1           entire profiled execution.
1 
1      'hot-bb-frequency-fraction'
1           Select the fraction of the entry block's execution frequency
1           that a basic block in the function must have to be considered
1           hot.
1 
1      'max-predicted-iterations'
1           The maximum number of loop iterations we predict statically.
1           This is useful in cases where a function contains a single
1           loop with known bound and another loop with unknown bound.
1           The known number of iterations is predicted correctly, while
1           the unknown number of iterations averages to roughly 10.  This
1           means that the loop without bounds appears artificially cold
1           relative to the other one.
1 
1      'builtin-expect-probability'
1           Control the probability of the expression having the specified
1           value.  This parameter takes a percentage (i.e.  0 ...  100)
1           as input.  The default probability of 90 is obtained
1           empirically.
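1 
1           A minimal sketch of a hinted branch follows.  It uses the GCC
1           built-in '__builtin_expect'; the function name
1           'checked_length' is hypothetical.  The hint does not change
1           the result of the comparison, only the probability the
1           compiler assumes for it.
1 
```c
#include <stddef.h>

/* Returns the length of s, or -1 for a null pointer.  The hint says
   the null case is expected to be rare. */
long checked_length(const char *s)
{
    if (__builtin_expect(s == NULL, 0))
        return -1;          /* predicted unlikely */
    long n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```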
1 
1      'align-threshold'
1 
1           Select fraction of the maximal frequency of executions of a
1           basic block in a function to align the basic block.
1 
1      'align-loop-iterations'
1 
1           A loop expected to iterate at least the selected number of
1           iterations is aligned.
1 
1      'tracer-dynamic-coverage'
1      'tracer-dynamic-coverage-feedback'
1 
1           This value is used to limit superblock formation once the
1           given percentage of executed instructions is covered.  This
1           limits unnecessary code size expansion.
1 
1           The 'tracer-dynamic-coverage-feedback' parameter is used only
1           when profile feedback is available.  Real profiles (as
1           opposed to statically estimated ones) are much less balanced,
1           allowing the threshold to be a larger value.
1 
1      'tracer-max-code-growth'
1           Stop tail duplication once code growth has reached given
1           percentage.  This is a rather artificial limit, as most of the
1           duplicates are eliminated later in cross jumping, so it may be
1           set to much higher values than is the desired code growth.
1 
1      'tracer-min-branch-ratio'
1 
1           Stop reverse growth when the reverse probability of best edge
1           is less than this threshold (in percent).
1 
1      'tracer-min-branch-probability'
1      'tracer-min-branch-probability-feedback'
1 
1           Stop forward growth if the best edge has probability lower
1           than this threshold.
1 
1           Similarly to 'tracer-dynamic-coverage', two parameters are
1           provided.  'tracer-min-branch-probability-feedback' is used
1           for compilation with profile feedback and
1           'tracer-min-branch-probability' for compilation without.  The
1           value for compilation with profile feedback needs to be more
1           conservative (higher) in order to make tracer effective.
1 
1      'stack-clash-protection-guard-size'
1           Specify the size of the operating system provided stack guard
1           as 2 raised to NUM bytes.  The default value is 12 (4096
1           bytes).  Acceptable values are between 12 and 30.  Higher
1           values may reduce the number of explicit probes, but a value
1           larger than the operating system provided guard will leave
1           code vulnerable to stack clash style attacks.
1 
1      'stack-clash-protection-probe-interval'
1           Stack clash protection involves probing stack space as it is
1           allocated.  This param controls the maximum distance between
1           probes into the stack as 2 raised to NUM bytes.  Acceptable
1           values are between 10 and 16; the default is 12.  Higher
1           values may reduce the number of explicit probes, but a value
1           larger than the operating system provided guard will leave
1           code vulnerable to stack clash style attacks.
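1 
1           Both stack clash parameters are exponents rather than byte
1           counts.  The following sketch shows the conversion; the helper
1           names are hypothetical, and the operating system guard size is
1           passed in by the caller rather than queried.
1 
```c
/* Converts a --param exponent NUM into a size in bytes (2 raised to NUM). */
unsigned long param_exponent_to_bytes(unsigned int num)
{
    return 1UL << num;
}

/* Nonzero if 2 raised to num bytes exceeds the guard that the operating
   system provides (os_guard_bytes is an input to this sketch). */
int exceeds_os_guard(unsigned int num, unsigned long os_guard_bytes)
{
    return param_exponent_to_bytes(num) > os_guard_bytes;
}
```
1 
1           For example, the default of 12 corresponds to 4096 bytes, and
1           the maximum probe interval of 16 corresponds to 65536 bytes.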
1 
1      'max-cse-path-length'
1 
1           The maximum number of basic blocks on path that CSE considers.
1           The default is 10.
1 
1      'max-cse-insns'
1           The maximum number of instructions CSE processes before
1           flushing.  The default is 1000.
1 
1      'ggc-min-expand'
1 
1           GCC uses a garbage collector to manage its own memory
1           allocation.  This parameter specifies the minimum percentage
1           by which the garbage collector's heap should be allowed to
1           expand between collections.  Tuning this may improve
1           compilation speed; it has no effect on code generation.
1 
1           The default is 30% + 70% * (RAM/1GB) with an upper bound of
1           100% when RAM >= 1GB.  If 'getrlimit' is available, the notion
1           of "RAM" is the smallest of actual RAM and 'RLIMIT_DATA' or
1           'RLIMIT_AS'.  If GCC is not able to calculate RAM on a
1           particular platform, the lower bound of 30% is used.  Setting
1           this parameter and 'ggc-min-heapsize' to zero causes a full
1           collection to occur at every opportunity.  This is extremely
1           slow, but can be useful for debugging.
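1 
1           The default formula can be worked through with the sketch
1           below.  It assumes that 1GB means 2 raised to 30 bytes and it
1           ignores the 'getrlimit' adjustments; the function name is
1           hypothetical.
1 
```c
/* Default ggc-min-expand percentage for a given amount of RAM in bytes,
   following the formula quoted above; pass 0 when RAM is unknown. */
double default_ggc_min_expand(double ram_bytes)
{
    const double one_gb = 1024.0 * 1024.0 * 1024.0;  /* assumed 2^30 */
    double percent = 30.0 + 70.0 * (ram_bytes / one_gb);
    return percent > 100.0 ? 100.0 : percent;
}
```
1 
1           Under these assumptions a host with 512MB of RAM gets 65%, and
1           any host with 1GB or more gets the upper bound of 100%.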
1 
1      'ggc-min-heapsize'
1 
1           Minimum size of the garbage collector's heap before it begins
1           bothering to collect garbage.  The first collection occurs
1           after the heap expands by 'ggc-min-expand'% beyond
1           'ggc-min-heapsize'.  Again, tuning this may improve
1           compilation speed, and has no effect on code generation.
1 
1           The default is the smaller of RAM/8, RLIMIT_RSS, or a limit
1           that tries to ensure that RLIMIT_DATA or RLIMIT_AS are not
1           exceeded, but with a lower bound of 4096 (four megabytes) and
1           an upper bound of 131072 (128 megabytes).  If GCC is not able
1           to calculate RAM on a particular platform, the lower bound is
1           used.  Setting this parameter very large effectively disables
1           garbage collection.  Setting this parameter and
1           'ggc-min-expand' to zero causes a full collection to occur at
1           every opportunity.
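1 
1           The following sketch works through the default and the point
1           of the first collection.  It treats the parameter as a size in
1           kilobytes, as the bounds above suggest, and it ignores the
1           RLIMIT_RSS, RLIMIT_DATA and RLIMIT_AS adjustments; the
1           function names are hypothetical.
1 
```c
/* Default ggc-min-heapsize in kilobytes for a given amount of RAM,
   ignoring the RLIMIT_* adjustments: RAM/8 clamped to [4096, 131072]. */
unsigned long default_ggc_min_heapsize_kb(unsigned long ram_kb)
{
    unsigned long kb = ram_kb / 8;
    if (kb < 4096UL)
        return 4096UL;
    if (kb > 131072UL)
        return 131072UL;
    return kb;
}

/* Heap size in kilobytes at which the first collection would occur:
   ggc-min-heapsize grown by ggc-min-expand percent. */
double first_collection_kb(double min_heapsize_kb, double min_expand_percent)
{
    return min_heapsize_kb * (1.0 + min_expand_percent / 100.0);
}
```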
1 
1      'max-reload-search-insns'
1           The maximum number of instructions reload should look backward
1           for an equivalent register.  Increasing values mean more
1           aggressive optimization, making the compilation time increase
1           with probably slightly better performance.  The default value
1           is 100.
1 
1      'max-cselib-memory-locations'
1           The maximum number of memory locations cselib should take into
1           account.  Increasing values mean more aggressive optimization,
1           making the compilation time increase with probably slightly
1           better performance.  The default value is 500.
1 
1      'max-sched-ready-insns'
1           The maximum number of instructions ready to be issued the
1           scheduler should consider at any given time during the first
1           scheduling pass.  Increasing values mean more thorough
1           searches, making the compilation time increase with probably
1           little benefit.  The default value is 100.
1 
1      'max-sched-region-blocks'
1           The maximum number of blocks in a region to be considered for
1           interblock scheduling.  The default value is 10.
1 
1      'max-pipeline-region-blocks'
1           The maximum number of blocks in a region to be considered for
1           pipelining in the selective scheduler.  The default value is
1           15.
1 
1      'max-sched-region-insns'
1           The maximum number of insns in a region to be considered for
1           interblock scheduling.  The default value is 100.
1 
1      'max-pipeline-region-insns'
1           The maximum number of insns in a region to be considered for
1           pipelining in the selective scheduler.  The default value is
1           200.
1 
1      'min-spec-prob'
1           The minimum probability (in percents) of reaching a source
1           block for interblock speculative scheduling.  The default
1           value is 40.
1 
1      'max-sched-extend-regions-iters'
1           The maximum number of iterations through CFG to extend
1           regions.  A value of 0 (the default) disables region
1           extensions.
1 
1      'max-sched-insn-conflict-delay'
1           The maximum conflict delay for an insn to be considered for
1           speculative motion.  The default value is 3.
1 
1      'sched-spec-prob-cutoff'
1           The minimal probability of speculation success (in percents),
1           so that speculative insns are scheduled.  The default value is
1           40.
1 
1      'sched-state-edge-prob-cutoff'
1           The minimum probability an edge must have for the scheduler to
1           save its state across it.  The default value is 10.
1 
1      'sched-mem-true-dep-cost'
1           Minimal distance (in CPU cycles) between store and load
1           targeting same memory locations.  The default value is 1.
1 
1      'selsched-max-lookahead'
1           The maximum size of the lookahead window of selective
1           scheduling.  It is a depth of search for available
1           instructions.  The default value is 50.
1 
1      'selsched-max-sched-times'
1           The maximum number of times that an instruction is scheduled
1           during selective scheduling.  This is the limit on the number
1           of iterations through which the instruction may be pipelined.
1           The default value is 2.
1 
1      'selsched-insns-to-rename'
1           The maximum number of best instructions in the ready list that
1           are considered for renaming in the selective scheduler.  The
1           default value is 2.
1 
1      'sms-min-sc'
1           The minimum value of stage count that swing modulo scheduler
1           generates.  The default value is 2.
1 
1      'max-last-value-rtl'
1           The maximum size measured as number of RTLs that can be
1           recorded in an expression in combiner for a pseudo register as
1           last known value of that register.  The default is 10000.
1 
1      'max-combine-insns'
1           The maximum number of instructions the RTL combiner tries to
1           combine.  The default value is 2 at '-Og' and 4 otherwise.
1 
1      'integer-share-limit'
1           Small integer constants can use a shared data structure,
1           reducing the compiler's memory usage and increasing its speed.
1           This sets the maximum value of a shared integer constant.  The
1           default value is 256.
1 
1      'ssp-buffer-size'
1           The minimum size of buffers (i.e. arrays) that receive stack
1           smashing protection when '-fstack-protector' is used.
1 
1      'min-size-for-stack-sharing'
1           The minimum size of variables taking part in stack slot
1           sharing when not optimizing.  The default value is 32.
1 
1      'max-jump-thread-duplication-stmts'
1           Maximum number of statements allowed in a block that needs to
1           be duplicated when threading jumps.
1 
1      'max-fields-for-field-sensitive'
1           Maximum number of fields in a structure treated in a field
1           sensitive manner during pointer analysis.  The default is zero
1           for '-O0' and '-O1', and 100 for '-Os', '-O2', and '-O3'.
1 
1      'prefetch-latency'
1           Estimate on average number of instructions that are executed
1           before prefetch finishes.  The distance prefetched ahead is
1           proportional to this constant.  Increasing this number may
1           also lead to fewer streams being prefetched (see
1           'simultaneous-prefetches').
1 
1      'simultaneous-prefetches'
1           Maximum number of prefetches that can run at the same time.
1 
1      'l1-cache-line-size'
1           The size of cache line in L1 cache, in bytes.
1 
1      'l1-cache-size'
1           The size of L1 cache, in kilobytes.
1 
1      'l2-cache-size'
1           The size of L2 cache, in kilobytes.
1 
1      'loop-interchange-max-num-stmts'
1           The maximum number of stmts in a loop to be interchanged.
1 
1      'loop-interchange-stride-ratio'
1           The minimum ratio between stride of two loops for interchange
1           to be profitable.
1 
1      'min-insn-to-prefetch-ratio'
1           The minimum ratio between the number of instructions and the
1           number of prefetches to enable prefetching in a loop.
1 
1      'prefetch-min-insn-to-mem-ratio'
1           The minimum ratio between the number of instructions and the
1           number of memory references to enable prefetching in a loop.
1 
1      'use-canonical-types'
1           Whether the compiler should use the "canonical" type system.
1           By default, this should always be 1, which uses a more
1           efficient internal mechanism for comparing types in C++ and
1           Objective-C++.  However, if bugs in the canonical type system
1           are causing compilation failures, set this value to 0 to
1           disable canonical types.
1 
1      'switch-conversion-max-branch-ratio'
1           Switch initialization conversion refuses to create arrays that
1           are bigger than 'switch-conversion-max-branch-ratio' times the
1           number of branches in the switch.
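1 
1           The following sketch shows the kind of switch this conversion
1           targets next to a hand-written array lookup, plus the size
1           bound described above.  All function names are hypothetical,
1           and whether GCC converts a particular switch depends on more
1           than this one limit.
1 
```c
/* A switch whose branches only select a constant value. */
int days_in_quarter_switch(int q)
{
    switch (q) {
    case 1: return 90;
    case 2: return 91;
    case 3: return 92;
    case 4: return 92;
    default: return 0;
    }
}

/* Hand-written equivalent: index a constant array, with a range check. */
int days_in_quarter_table(int q)
{
    static const int days[4] = {90, 91, 92, 92};
    return (q >= 1 && q <= 4) ? days[q - 1] : 0;
}

/* Largest array size the parameter would permit for a given switch. */
long max_array_elements(long ratio, long branches)
{
    return ratio * branches;
}
```
1 
1           With an illustrative ratio of 8, a switch with 4 branches
1           would be limited to an array of 32 elements, so a switch whose
1           case values are spread over a wide range is left alone.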
1 
1      'max-partial-antic-length'
1           Maximum length of the partial antic set computed during the
1           tree partial redundancy elimination optimization
1           ('-ftree-pre') when optimizing at '-O3' and above.  For some
1           sorts of source code the enhanced partial redundancy
1           elimination optimization can run away, consuming all of the
1           memory available on the host machine.  This parameter sets a
1           limit on the length of the sets that are computed, which
1           prevents the runaway behavior.  Setting a value of 0 for this
1           parameter allows an unlimited set length.
1 
1      'sccvn-max-scc-size'
1           Maximum size of a strongly connected component (SCC) during
1           SCCVN processing.  If this limit is hit, SCCVN processing for
1           the whole function is not done and optimizations depending on
1           it are disabled.  The default maximum SCC size is 10000.
1 
1      'sccvn-max-alias-queries-per-access'
1           Maximum number of alias-oracle queries we perform when looking
1           for redundancies for loads and stores.  If this limit is hit
1           the search is aborted and the load or store is not considered
1           redundant.  The number of queries is algorithmically limited
1           to the number of stores on all paths from the load to the
1           function entry.  The default maximum number of queries is
1           1000.
1 
1      'ira-max-loops-num'
1           IRA uses regional register allocation by default.  If a
1           function contains more loops than the number given by this
1           parameter, only at most the given number of the most
1           frequently-executed loops form regions for regional register
1           allocation.  The default value of the parameter is 100.
1 
1      'ira-max-conflict-table-size'
1           Although IRA uses a sophisticated algorithm to compress the
1           conflict table, the table can still require excessive amounts
1           of memory for huge functions.  If the conflict table for a
1           function could be more than the size in MB given by this
1           parameter, the register allocator instead uses a faster,
1           simpler, and lower-quality algorithm that does not require
1           building a pseudo-register conflict table.  The default value
1           of the parameter is 2000.
1 
1      'ira-loop-reserved-regs'
1           IRA can be used to evaluate more accurate register pressure in
1           loops for decisions to move loop invariants (see '-O3').  The
1           number of available registers reserved for some other purposes
1           is given by this parameter.  The default value of the
1           parameter is 2, which is the minimal number of registers
1           needed by typical instructions.  This value is the best found
1           from numerous experiments.
1 
1      'lra-inheritance-ebb-probability-cutoff'
1           LRA tries to reuse values reloaded in registers in subsequent
1           insns.  This optimization is called inheritance.  EBB is used
1           as a region to do this optimization.  The parameter defines a
1           minimal fall-through edge probability in percentage used to
1           add BB to inheritance EBB in LRA. The default value of the
1           parameter is 40.  The value was chosen from numerous runs of
1           SPEC2000 on x86-64.
1 
1      'loop-invariant-max-bbs-in-loop'
1           Loop invariant motion can be very expensive, both in
1           compilation time and in amount of needed compile-time memory,
1           with very large loops.  Loops with more basic blocks than this
1           parameter won't have loop invariant motion optimization
1           performed on them.  The default value of the parameter is 1000
1           for '-O1' and 10000 for '-O2' and above.
1 
1      'loop-max-datarefs-for-datadeps'
1           Building data dependencies is expensive for very large loops.
1           This parameter limits the number of data references in loops
1           that are considered for data dependence analysis.  These large
1           loops are not handled by the optimizations using loop data
1           dependencies.  The default value is 1000.
1 
1      'max-vartrack-size'
1           Sets a maximum number of hash table slots to use during
1           variable tracking dataflow analysis of any function.  If this
1           limit is exceeded with variable tracking at assignments
1           enabled, analysis for that function is retried without it,
1           after removing all debug insns from the function.  If the
1           limit is exceeded even without debug insns, var tracking
1           analysis is completely disabled for the function.  Setting the
1           parameter to zero makes it unlimited.
1 
1      'max-vartrack-expr-depth'
1           Sets a maximum number of recursion levels when attempting to
1           map variable names or debug temporaries to value expressions.
1           This trades compilation time for more complete debug
1           information.  If this is set too low, value expressions that
1           are available and could be represented in debug information
1           may end up not being used; setting this higher may enable the
1           compiler to find more complex debug expressions, but compile
1           time and memory use may grow.  The default is 12.
1 
1      'max-debug-marker-count'
1           Sets a threshold on the number of debug markers (e.g.  begin
1           stmt markers) to avoid complexity explosion at inlining or
1           expanding to RTL. If a function has more such gimple stmts
1           than the set limit, such stmts will be dropped from the
1           inlined copy of a function, and from its RTL expansion.  The
1           default is 100000.
1 
1      'min-nondebug-insn-uid'
1           Use uids starting at this parameter for nondebug insns.  The
1           range below the parameter is reserved exclusively for debug
1           insns created by '-fvar-tracking-assignments', but debug insns
1           may get (non-overlapping) uids above it if the reserved range
1           is exhausted.
1 
1      'ipa-sra-ptr-growth-factor'
1           IPA-SRA replaces a pointer to an aggregate with one or more
1           new parameters only when their cumulative size is less than or
1           equal to 'ipa-sra-ptr-growth-factor' times the size of the
1           original pointer parameter.
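1 
1           The size test can be written as a one-line predicate.  This is
1           only a sketch of the comparison described above, with a
1           hypothetical function name; it does not model the rest of the
1           IPA-SRA decision.
1 
```c
#include <stddef.h>

/* Nonzero if replacement parameters of the given cumulative size fit
   within growth_factor times the size of the original pointer. */
int ipa_sra_size_ok(size_t cumulative_size, size_t pointer_size,
                    size_t growth_factor)
{
    return cumulative_size <= growth_factor * pointer_size;
}
```
1 
1           With 8-byte pointers and an illustrative factor of 2, two
1           8-byte fields (16 bytes in total) pass the test, while 17
1           bytes of replacements do not.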
1 
1      'sra-max-scalarization-size-Ospeed'
1      'sra-max-scalarization-size-Osize'
1           The two Scalar Replacement of Aggregates passes (SRA and
1           IPA-SRA) aim to replace scalar parts of aggregates with uses
1           of independent scalar variables.  These parameters control the
1           maximum size, in storage units, of aggregate which is
1           considered for replacement when compiling for speed
1           ('sra-max-scalarization-size-Ospeed') or size
1           ('sra-max-scalarization-size-Osize') respectively.
1 
1      'sra-max-propagations'
1           The maximum number of artificial accesses that Scalar
1           Replacement of Aggregates (SRA) will track, per one local
1           variable, in order to facilitate copy propagation.
1 
1      'tm-max-aggregate-size'
1           When making copies of thread-local variables in a transaction,
1           this parameter specifies the size in bytes after which
1           variables are saved with the logging functions as opposed to
1           save/restore code sequence pairs.  This option only applies
1           when using '-fgnu-tm'.
1 
1      'graphite-max-nb-scop-params'
1           To avoid exponential effects in the Graphite loop transforms,
1           the number of parameters in a Static Control Part (SCoP) is
1           bounded.  The default value is 10 parameters, a value of zero
1           can be used to lift the bound.  A variable whose value is
1           unknown at compilation time and defined outside a SCoP is a
1           parameter of the SCoP.
1 
1      'loop-block-tile-size'
1           Loop blocking or strip mining transforms, enabled with
1           '-floop-block' or '-floop-strip-mine', strip mine each loop in
1           the loop nest by a given number of iterations.  The strip
1           length can be changed using the 'loop-block-tile-size'
1           parameter.  The default value is 51 iterations.
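1 
1           Strip mining can be pictured with the hand-written C sketch
1           below, where 'tile' plays the role of the strip length.  The
1           function name is hypothetical, and the code only illustrates
1           the loop shape, not what Graphite generates.
1 
```c
#include <stddef.h>

/* Strip-mined summation: the outer loop steps by tile, the inner loop
   covers one strip and is clipped at the end of the array. */
long sum_strip_mined(const int *a, size_t n, size_t tile)
{
    long s = 0;
    if (tile == 0)
        tile = 1;   /* guard for this sketch: avoid a zero step */
    for (size_t start = 0; start < n; start += tile) {
        size_t end = (n - start < tile) ? n : start + tile;
        for (size_t i = start; i < end; i++)
            s += a[i];
    }
    return s;
}
```
1 
1           With 120 elements and a strip length of 51, the outer loop
1           runs three times, over strips of 51, 51 and 18 elements.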
1 
1      'loop-unroll-jam-size'
1           Specify the unroll factor for the '-floop-unroll-and-jam'
1           option.  The default value is 4.
1 
1      'loop-unroll-jam-depth'
1           Specify the dimension to be unrolled (counting from the
1           innermost loop) for the '-floop-unroll-and-jam' option.  The
1           default value is 2.
1 
1      'ipa-cp-value-list-size'
1           IPA-CP attempts to track all possible values and types passed
1           to a function's parameter in order to propagate them and
1           perform devirtualization.  'ipa-cp-value-list-size' is the
1           maximum number of values and types it stores per one formal
1           parameter of a function.
1 
1      'ipa-cp-eval-threshold'
1           IPA-CP calculates its own score of cloning profitability
1           heuristics and performs those cloning opportunities with
1           scores that exceed 'ipa-cp-eval-threshold'.
1 
1      'ipa-cp-recursion-penalty'
1           Percentage penalty the recursive functions will receive when
1           they are evaluated for cloning.
1 
1      'ipa-cp-single-call-penalty'
1           Percentage penalty functions containing a single call to
1           another function will receive when they are evaluated for
1           cloning.
1 
1      'ipa-max-agg-items'
1           IPA-CP can also propagate a number of scalar values passed in
1           an aggregate.  'ipa-max-agg-items' controls the maximum number
1           of such values per one parameter.
1 
1      'ipa-cp-loop-hint-bonus'
1           When IPA-CP determines that a cloning candidate would make the
1           number of iterations of a loop known, it adds a bonus of
1           'ipa-cp-loop-hint-bonus' to the profitability score of the
1           candidate.
1 
1      'ipa-cp-array-index-hint-bonus'
1           When IPA-CP determines that a cloning candidate would make the
1           index of an array access known, it adds a bonus of
1           'ipa-cp-array-index-hint-bonus' to the profitability score of
1           the candidate.
1 
1      'ipa-max-aa-steps'
1           During its analysis of function bodies, IPA-CP employs alias
1           analysis in order to track values pointed to by function
1           parameters.  In order not to spend too much time analyzing
1           huge functions, it gives up and considers all memory
1           clobbered after examining 'ipa-max-aa-steps' statements
1           modifying memory.
1 
1      'lto-partitions'
1           Specify desired number of partitions produced during WHOPR
1           compilation.  The number of partitions should exceed the
1           number of CPUs used for compilation.  The default value is 32.
1 
1      'lto-min-partition'
1           Size of minimal partition for WHOPR (in estimated
1           instructions).  This prevents expenses of splitting very small
1           programs into too many partitions.
1 
1      'lto-max-partition'
1           Size of max partition for WHOPR (in estimated instructions).
1           This provides an upper bound for the size of an individual
1           partition.  Meant to be used only with balanced partitioning.
1 
1      'cxx-max-namespaces-for-diagnostic-help'
1           The maximum number of namespaces to consult for suggestions
1           when C++ name lookup fails for an identifier.  The default is
1           1000.
1 
1      'sink-frequency-threshold'
1           The maximum relative execution frequency (in percents) of the
1           target block relative to a statement's original block to allow
1           statement sinking of a statement.  Larger numbers result in
1           more aggressive statement sinking.  The default value is 75.
1           A small positive adjustment is applied for statements with
1           memory operands as those are even more profitable to sink.
1 
1      'max-stores-to-sink'
1           The maximum number of conditional store pairs that can be
1           sunk.  Set to 0 if either vectorization ('-ftree-vectorize')
1           or if-conversion ('-ftree-loop-if-convert') is disabled.  The
1           default is 2.
1 
1      'allow-store-data-races'
1           Allow optimizers to introduce new data races on stores.  Set
1           to 1 to allow, otherwise to 0.  This option is enabled by
1           default at optimization level '-Ofast'.
1 
1      'case-values-threshold'
1           The smallest number of different values for which it is best
1           to use a jump-table instead of a tree of conditional branches.
1           If the value is 0, use the default for the machine.  The
1           default is 0.
1 
1      'tree-reassoc-width'
1           Set the maximum number of instructions executed in parallel in
1           a reassociated tree.  If it has a nonzero value, this
1           parameter overrides the target-dependent heuristics used by
1           default.
1 
1      'sched-pressure-algorithm'
1           Choose between the two available implementations of
1           '-fsched-pressure'.  Algorithm 1 is the original
1           implementation and is the more likely to prevent instructions
1           from being reordered.  Algorithm 2 was designed to be a
1           compromise between the relatively conservative approach taken
1           by algorithm 1 and the rather aggressive approach taken by the
1           default scheduler.  It relies more heavily on having a regular
1           register file and accurate register pressure classes.  See
1           'haifa-sched.c' in the GCC sources for more details.
1 
1           The default choice depends on the target.
1 
1      'max-slsr-cand-scan'
1           Set the maximum number of existing candidates that are
1           considered when seeking a basis for a new straight-line
1           strength reduction candidate.
1 
1      'asan-globals'
1           Enable buffer overflow detection for global objects.  This
1           kind of protection is enabled by default if you are using
1           '-fsanitize=address' option.  To disable global objects
1           protection use '--param asan-globals=0'.
1 
1      'asan-stack'
1           Enable buffer overflow detection for stack objects.  This kind
1           of protection is enabled by default when using
1           '-fsanitize=address'.  To disable stack protection, use
1           '--param asan-stack=0'.
1 
1      'asan-instrument-reads'
1           Enable buffer overflow detection for memory reads.  This kind
1           of protection is enabled by default when using
1           '-fsanitize=address'.  To disable protection of memory reads,
1           use '--param asan-instrument-reads=0'.
1 
1      'asan-instrument-writes'
1           Enable buffer overflow detection for memory writes.  This kind
1           of protection is enabled by default when using
1           '-fsanitize=address'.  To disable protection of memory writes,
1           use '--param asan-instrument-writes=0'.
1 
1      'asan-memintrin'
1           Enable detection for built-in functions.  This kind of
1           protection is enabled by default when using
1           '-fsanitize=address'.  To disable protection for built-in
1           functions, use '--param asan-memintrin=0'.
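
The following small program touches each kind of access that the five parameters above control.  It is a sketch only: every access is in bounds, so it runs cleanly with or without instrumentation, and the comments give an assumed mapping from each line to the parameter that governs its checks.  The names `global_table` and `copy_and_sum` are hypothetical.

```c
#include <string.h>

/* A global object (asan-globals).  */
static int global_table[4] = { 1, 2, 3, 4 };

/* Build with, for example,
       gcc -g -fsanitize=address --param asan-stack=0 example.c
   to keep the other checks but drop those for stack objects.  */
int
copy_and_sum (void)
{
  int local[4];                 /* a stack object (asan-stack) */
  int sum = 0;

  /* A built-in memory function (asan-memintrin).  */
  memcpy (local, global_table, sizeof local);

  local[0] = 10;                /* a memory write (asan-instrument-writes) */

  for (int i = 0; i < 4; i++)
    sum += local[i];            /* a memory read (asan-instrument-reads) */

  return sum;
}
```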
1 
1      'asan-use-after-return'
1           Enable detection of use-after-return.  This kind of protection
1           is enabled by default when using the '-fsanitize=address'
1           option.  To disable it, use '--param asan-use-after-return=0'.
1 
1           Note: By default the check is disabled at run time.  To enable
1           it, add 'detect_stack_use_after_return=1' to the environment
1           variable 'ASAN_OPTIONS'.
1 
1      'asan-instrumentation-with-call-threshold'
1           If the number of memory accesses in the function being
1           instrumented is greater than or equal to this number, use
1           callbacks instead of inline checks.  For example, to disable
1           inline code, use '--param
1           asan-instrumentation-with-call-threshold=0'.
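
The rule stated above can be modeled in a few lines.  This is only a model of the documented comparison, not code from GCC; the helper name `uses_callbacks` is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of the documented rule: callbacks are used when
   the number of memory accesses in the function is greater than or
   equal to the threshold.  */
bool
uses_callbacks (unsigned accesses, unsigned threshold)
{
  return accesses >= threshold;
}
```

With a threshold of 0 the comparison holds for every function, which is why that value disables inline checks altogether.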
1 
1      'use-after-scope-direct-emission-threshold'
1           If the size of a local variable in bytes is less than or equal
1           to this number, directly poison (or unpoison) shadow memory
1           instead of using run-time callbacks.  The default value is
1           256.
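
As an illustration, the function below has one scoped local variable on each side of the default threshold of 256 bytes.  It is a sketch under the assumption that it is built with '-fsanitize=address -fsanitize-address-use-after-scope'; the function name and the buffer sizes are hypothetical.

```c
#include <string.h>

int
scoped_buffers (void)
{
  int total = 0;

  {
    /* 64 bytes: not larger than the default threshold of 256, so by
       the description above its shadow memory would be poisoned and
       unpoisoned directly.  */
    char small[64];
    memset (small, 1, sizeof small);
    total += small[63];
  }

  {
    /* 1024 bytes: larger than the default threshold, so run-time
       callbacks would be used instead.  */
    char large[1024];
    memset (large, 2, sizeof large);
    total += large[1023];
  }

  return total;
}
```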
1 
1      'chkp-max-ctor-size'
1           Static constructors generated by the Pointer Bounds Checker may
1           become very large and significantly increase compile time at
1           optimization level '-O1' and higher.  This parameter sets the
1           maximum number of statements in a single generated
1           constructor.  The default value is 5000.
1 
1      'max-fsm-thread-path-insns'
1           Maximum number of instructions to copy when duplicating blocks
1           on a finite state automaton jump thread path.  The default is
1           100.
1 
1      'max-fsm-thread-length'
1           Maximum number of basic blocks on a finite state automaton
1           jump thread path.  The default is 10.
1 
1      'max-fsm-thread-paths'
1           Maximum number of new jump thread paths to create for a finite
1           state automaton.  The default is 50.
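
The three parameters above bound jump threading for code shaped like the following: a loop whose body switches on a state variable that is assigned constant values inside the cases.  This is a minimal sketch with a hypothetical function name; whether a given path is actually threaded depends on the limits above and on the rest of the optimizer.

```c
/* Count space-separated words with a two-state automaton.  After
   'state = INSIDE' the next value tested by the switch is already
   known, so the path through the loop back edge to 'case INSIDE' is
   a candidate for threading at the cost of duplicating blocks.  */
int
count_words (const char *s)
{
  enum { OUTSIDE, INSIDE } state = OUTSIDE;
  int words = 0;

  for (; *s != '\0'; s++)
    {
      switch (state)
        {
        case OUTSIDE:
          if (*s != ' ')
            {
              words++;
              state = INSIDE;
            }
          break;

        case INSIDE:
          if (*s == ' ')
            state = OUTSIDE;
          break;
        }
    }

  return words;
}
```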
1 
1      'parloops-chunk-size'
1           Chunk size of the OpenMP schedule for loops parallelized by
1           parloops.  The default is 0.
1 
1      'parloops-schedule'
1           Schedule type of the OpenMP schedule for loops parallelized by
1           parloops (static, dynamic, guided, auto, runtime).  The
1           default is static.
1 
1      'parloops-min-per-thread'
1           The minimum number of iterations per thread of an innermost
1           parallelized loop for which the parallelized variant is
1           preferred over the single-threaded one.  The default is 100.
1           Note that for a parallelized loop nest the minimum number of
1           iterations of the outermost loop per thread is two.
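
A loop of the following shape, with independent iterations, is the kind that '-ftree-parallelize-loops' may hand to parloops.  The function name and the parameter values in the comment are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Each iteration writes a distinct element and reads only its own
   inputs, so iterations can run on different threads.  One possible
   command line (values are examples only):
       gcc -O2 -ftree-parallelize-loops=4 \
           --param parloops-schedule=dynamic \
           --param parloops-chunk-size=64 -c example.c  */
void
scale_add (double *restrict out, const double *restrict a,
           const double *restrict b, double k, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] * k + b[i];
}
```

With the default 'parloops-min-per-thread' of 100 and four threads, a call with a small 'n' would be expected to use the single-threaded variant.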
1 
1      'max-ssa-name-query-depth'
1           Maximum depth of recursion when querying properties of SSA
1           names in things like fold routines.  One level of recursion
1           corresponds to following a use-def chain.
1 
1      'hsa-gen-debug-stores'
1           Enable emission of special debug stores within HSA kernels,
1           which are then read and reported by the libgomp plugin.
1           Generation of these stores is disabled by default; use
1           '--param hsa-gen-debug-stores=1' to enable it.
1 
1      'max-speculative-devirt-maydefs'
1           The maximum number of may-defs we analyze when looking for a
1           must-def specifying the dynamic type of an object that invokes
1           a virtual call we may be able to devirtualize speculatively.
1 
1      'max-vrp-switch-assertions'
1           The maximum number of assertions to add along the default edge
1           of a switch statement during VRP.  The default is 10.
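
As a sketch of what such assertions make possible, consider the switch below.  On the default edge the value is known to differ from every listed case, so a later test against one of those values can be folded away.  The function name is hypothetical, and whether the fold happens depends on the optimization level and the compiler version.

```c
int
bucket (unsigned x)
{
  switch (x)
    {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    default:
      /* Here X is known not to be 0, 1, or 2.  Each such fact is one
         assertion on the default edge; the parameter caps how many
         are added.  */
      if (x == 1)
        return -1;      /* never reached; VRP may remove this test */
      return 0;
    }
}
```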
1 
1      'unroll-jam-min-percent'
1           The minimum percentage of memory references that must be
1           optimized away for the unroll-and-jam transformation to be
1           considered profitable.
1 
1      'unroll-jam-max-unroll'
1           The maximum number of times the outer loop should be unrolled
1           by the unroll-and-jam transformation.
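
The following sketch shows unroll-and-jam done by hand with an unroll factor of 2, to make clear which memory references the two parameters above refer to.  The function names and array sizes are hypothetical, and the jammed version assumes an even number of rows.

```c
enum { ROWS = 4, COLS = 3 };

/* Original loop nest: C[J] is reloaded for every row.  */
void
outer_product (int out[ROWS][COLS], const int r[ROWS], const int c[COLS])
{
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      out[i][j] = r[i] * c[j];
}

/* Outer loop unrolled twice and the two inner loops fused (jammed).
   One load of C[J] now serves two rows, so half of those references
   are optimized away.  ROWS must be even for this version.  */
void
outer_product_jammed (int out[ROWS][COLS], const int r[ROWS],
                      const int c[COLS])
{
  for (int i = 0; i < ROWS; i += 2)
    for (int j = 0; j < COLS; j++)
      {
        int cj = c[j];
        out[i][j] = r[i] * cj;
        out[i + 1][j] = r[i + 1] * cj;
      }
}
```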
1