gcc: AArch64 Options

1 
1 3.18.1 AArch64 Options
1 ----------------------
1 
1 These options are defined for AArch64 implementations:
1 
1 '-mabi=NAME'
1      Generate code for the specified data model.  Permissible values are
1      'ilp32' for SysV-like data model where int, long int and pointers
1      are 32 bits, and 'lp64' for SysV-like data model where int is 32
1      bits, but long int and pointers are 64 bits.
1 
1      The default depends on the specific target configuration.  Note
1      that the LP64 and ILP32 ABIs are not link-compatible; you must
1      compile your entire program with the same ABI, and link with a
1      compatible set of libraries.
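
As a rough illustration of the two data models, the following C sketch reports which one a translation unit was built for by looking only at type sizes.  The function name 'data_model_name' is hypothetical, and the cross-check assumes the compiler predefines '__LP64__' or '__ILP32__' as GCC normally does.

```c
#include <stddef.h>

/* Hypothetical helper: names the data model implied by the sizes of
   int, long int and pointers in this translation unit. */
const char *data_model_name(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ilp32";
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "lp64";
    return "other";  /* neither of the two models described above */
}

/* Compile-time cross-check, assuming the usual predefined macros. */
#if defined(__LP64__)
_Static_assert(sizeof(long) == 8 && sizeof(void *) == 8, "LP64 expected");
#elif defined(__ILP32__)
_Static_assert(sizeof(long) == 4 && sizeof(void *) == 4, "ILP32 expected");
#endif
```

Building the same file once with '-mabi=lp64' and once with '-mabi=ilp32' should give different answers, which is one way to see why objects from the two builds cannot be linked together.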
1 
1 '-mbig-endian'
1      Generate big-endian code.  This is the default when GCC is
1      configured for an 'aarch64_be-*-*' target.
1 
1 '-mgeneral-regs-only'
1      Generate code which uses only the general-purpose registers.  This
1      prevents the compiler from using floating-point and Advanced SIMD
1      registers but does not impose any restrictions on the assembler.
1 
1 '-mlittle-endian'
1      Generate little-endian code.  This is the default when GCC is
1      configured for an 'aarch64-*-*' but not an 'aarch64_be-*-*' target.
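
Code that depends on byte order can check which setting is in effect.  The sketch below is one possible way to do so; the function names are hypothetical, and the compile-time branch assumes the '__BYTE_ORDER__' and '__ORDER_LITTLE_ENDIAN__' macros that GCC predefines.

```c
#include <stdint.h>
#include <string.h>

/* Run-time probe: returns 1 if the least significant byte of a
   32-bit value is stored at the lowest address. */
int is_little_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0x04;
}

/* Compile-time view of the same property, falling back to the
   run-time probe if the macros are not available. */
int compiled_little_endian(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
    return is_little_endian();
#endif
}
```

Both functions are expected to agree for any given build; compiling with '-mbig-endian' instead of '-mlittle-endian' should flip both results.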
1 
1 '-mcmodel=tiny'
1      Generate code for the tiny code model.  The program and its
1      statically defined symbols must be within 1MB of each other.
1      Programs can be statically or dynamically linked.
1 
1 '-mcmodel=small'
1      Generate code for the small code model.  The program and its
1      statically defined symbols must be within 4GB of each other.
1      Programs can be statically or dynamically linked.  This is the
1      default code model.
1 
1 '-mcmodel=large'
1      Generate code for the large code model.  This makes no assumptions
1      about addresses and sizes of sections.  Programs can be statically
1      linked only.
1 
1 '-mstrict-align'
1      Avoid generating memory accesses that may not be aligned on a
1      natural object boundary as described in the architecture
1      specification.
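
The option concerns the accesses the compiler generates; source code should still avoid dereferencing misaligned pointers.  A minimal sketch of the usual portable idiom follows, with hypothetical helper names.  It copies through 'memcpy' so that the compiler is free to choose an access sequence that suits the alignment rules in force.

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value from an address with no alignment guarantee. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Writes a 32-bit value to an address with no alignment guarantee. */
void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

How these calls are expanded depends on the GCC version and optimization level; comparing the assembly output with and without '-mstrict-align' is the simplest way to see the difference on a given toolchain.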
1 
1 '-momit-leaf-frame-pointer'
1 '-mno-omit-leaf-frame-pointer'
1      Omit or keep the frame pointer in leaf functions.  The former
1      behavior is the default.
1 
1 '-mtls-dialect=desc'
1      Use TLS descriptors as the thread-local storage mechanism for
1      dynamic accesses of TLS variables.  This is the default.
1 
1 '-mtls-dialect=traditional'
1      Use traditional TLS as the thread-local storage mechanism for
1      dynamic accesses of TLS variables.
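
For context, the kind of source construct these options affect is an access to a thread-local variable whose address has to be resolved dynamically, for example in position-independent code built for a shared library.  The names below are hypothetical, and the sketch assumes a C11 compiler.

```c
/* One counter per thread. */
_Thread_local unsigned long call_count;

/* Increments the calling thread's counter and returns the new value. */
unsigned long bump_call_count(void)
{
    return ++call_count;
}
```

The C source is the same under either dialect; only the instruction sequence used to locate 'call_count' is expected to differ.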
1 
1 '-mtls-size=SIZE'
1      Specify bit size of immediate TLS offsets.  Valid values are 12,
1      24, 32, 48.  This option requires binutils 2.26 or newer.
1 
1 '-mfix-cortex-a53-835769'
1 '-mno-fix-cortex-a53-835769'
1      Enable or disable the workaround for the ARM Cortex-A53 erratum
1      number 835769.  This involves inserting a NOP instruction between
1      memory instructions and 64-bit integer multiply-accumulate
1      instructions.
1 
1 '-mfix-cortex-a53-843419'
1 '-mno-fix-cortex-a53-843419'
1      Enable or disable the workaround for the ARM Cortex-A53 erratum
1      number 843419.  The workaround is applied at link time; this option
1      only passes the corresponding flag to the linker.
1 
1 '-mlow-precision-recip-sqrt'
1 '-mno-low-precision-recip-sqrt'
1      Enable or disable the reciprocal square root approximation.  This
1      option only has an effect if '-ffast-math' or
1      '-funsafe-math-optimizations' is used as well.  Enabling this
1      reduces precision of reciprocal square root results to about 16
1      bits for single precision and to 32 bits for double precision.
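
To give a feel for the trade-off between speed and precision, the sketch below shows the general estimate-then-refine technique for a reciprocal square root in portable C.  It is an illustration only, not the sequence GCC emits: the initial estimate uses a well-known bit trick as a stand-in for a hardware estimate instruction, all function names are hypothetical, and the input is assumed to be a positive, normal 'float'.

```c
#include <stdint.h>
#include <string.h>

/* Crude first guess at 1/sqrt(x), within a few percent.
   Assumes x is a positive, normal IEEE single-precision value. */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton-Raphson step for 1/sqrt(x); each step roughly doubles
   the number of correct bits. */
static float rsqrt_refine(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* Estimate followed by a chosen number of refinement steps. */
float approx_rsqrt(float x, int steps)
{
    float y = rsqrt_estimate(x);
    for (int i = 0; i < steps; i++)
        y = rsqrt_refine(x, y);
    return y;
}
```

Fewer refinement steps mean less work and a less accurate result, which is the same kind of compromise the option makes.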
1 
1 '-mlow-precision-sqrt'
1 '-mno-low-precision-sqrt'
1      Enable or disable the square root approximation.  This option only
1      has an effect if '-ffast-math' or '-funsafe-math-optimizations' is
1      used as well.  Enabling this reduces precision of square root
1      results to about 16 bits for single precision and to 32 bits for
1      double precision.  If enabled, it implies
1      '-mlow-precision-recip-sqrt'.
1 
1 '-mlow-precision-div'
1 '-mno-low-precision-div'
1      Enable or disable the division approximation.  This option only has
1      an effect if '-ffast-math' or '-funsafe-math-optimizations' is used
1      as well.  Enabling this reduces precision of division results to
1      about 16 bits for single precision and to 32 bits for double
1      precision.
1 
1 '-moutline-atomics'
1 '-mno-outline-atomics'
1      Enable or disable calls to out-of-line helpers to implement atomic
1      operations.  These helpers will, at runtime, determine if the LSE
1      instructions from ARMv8.1-A can be used; if not, they will use the
1      load/store-exclusive instructions that are present in the base
1      ARMv8.0 ISA.
1 
1      This option is only applicable when compiling for the base ARMv8.0
1      instruction set.  If using a later revision, e.g.
1      '-march=armv8.1-a' or '-march=armv8-a+lse', the ARMv8.1-Atomics
1      instructions will be used directly.  The same applies when using
1      '-mcpu=' when the selected cpu supports the 'lse' feature.
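
As an example of source code this option applies to, consider a C11 atomic read-modify-write operation.  The sketch below uses hypothetical function names.  When compiled for the base ARMv8.0 instruction set with '-moutline-atomics', an operation like the fetch-and-add here may become a call to an out-of-line helper instead of an inline load/store-exclusive loop; the exact outcome depends on the GCC version.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t event_count;

/* Atomically increments the counter and returns its previous value. */
uint64_t record_event(void)
{
    return atomic_fetch_add_explicit(&event_count, 1, memory_order_relaxed);
}

/* Returns the current value of the counter. */
uint64_t events_so_far(void)
{
    return atomic_load_explicit(&event_count, memory_order_relaxed);
}
```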
1 
1 '-march=NAME'
1      Specify the name of the target architecture and, optionally, one or
1      more feature modifiers.  This option has the form
1      '-march=ARCH{+[no]FEATURE}*'.
1 
1      The permissible values for ARCH are 'armv8-a', 'armv8.1-a',
1      'armv8.2-a', 'armv8.3-a', 'armv8.4-a' or 'native'.
1 
1      The value 'armv8.4-a' implies 'armv8.3-a' and enables compiler
1      support for the ARMv8.4-A architecture extensions.
1 
1      The value 'armv8.3-a' implies 'armv8.2-a' and enables compiler
1      support for the ARMv8.3-A architecture extensions.
1 
1      The value 'armv8.2-a' implies 'armv8.1-a' and enables compiler
1      support for the ARMv8.2-A architecture extensions.
1 
1      The value 'armv8.1-a' implies 'armv8-a' and enables compiler
1      support for the ARMv8.1-A architecture extension.  In particular,
1      it enables the '+crc', '+lse', and '+rdma' features.
1 
1      The value 'native' is available on native AArch64 GNU/Linux and
1      causes the compiler to pick the architecture of the host system.
1      This option has no effect if the compiler is unable to recognize
1      the architecture of the host system.
1 
1      The permissible values for FEATURE are listed in the sub-section on
1      '-march' and '-mcpu' Feature Modifiers.  Where conflicting feature
1      modifiers are specified, the right-most feature is used.
1 
1      GCC uses NAME to determine what kind of instructions it can emit
1      when generating assembly code.  If '-march' is specified without
1      either of '-mtune' or '-mcpu' also being specified, the code is
1      tuned to perform well across a range of target processors
1      implementing the target architecture.
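
Source code can adapt to the selected architecture through predefined macros.  The sketch below assumes the ACLE feature-test macros '__ARM_FEATURE_CRC32' and '__ARM_FEATURE_SVE' and the '__aarch64__' target macro; whether a particular macro is defined depends on the compiler version, and the struct and function names are hypothetical.

```c
/* Snapshot of a few features the compiler was allowed to use. */
struct target_features {
    int aarch64;  /* compiled for AArch64 at all */
    int crc;      /* CRC extension available, e.g. via '+crc' */
    int sve;      /* SVE available, e.g. via '+sve' */
};

struct target_features compiled_features(void)
{
    struct target_features f = {0, 0, 0};
#if defined(__aarch64__)
    f.aarch64 = 1;
#endif
#if defined(__ARM_FEATURE_CRC32)
    f.crc = 1;
#endif
#if defined(__ARM_FEATURE_SVE)
    f.sve = 1;
#endif
    return f;
}
```

Compiling once with '-march=armv8-a' and once with '-march=armv8.1-a' would be expected to change the 'crc' field, since the latter enables '+crc'.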
1 
1 '-mtune=NAME'
1      Specify the name of the target processor for which GCC should tune
1      the performance of the code.  Permissible values for this option
1      are: 'generic', 'cortex-a35', 'cortex-a53', 'cortex-a55',
1      'cortex-a57', 'cortex-a72', 'cortex-a73', 'cortex-a75',
1      'cortex-a76', 'ares', 'neoverse-n1', 'neoverse-n2', 'neoverse-v1',
1      'zeus', 'neoverse-512tvb', 'exynos-m1', 'falkor', 'qdf24xx',
1      'saphira', 'xgene1', 'vulcan', 'thunderx', 'thunderxt88',
1      'thunderxt88p1', 'thunderxt81', 'thunderxt83', 'thunderx2t99',
1      'cortex-a57.cortex-a53', 'cortex-a72.cortex-a53',
1      'cortex-a73.cortex-a35', 'cortex-a73.cortex-a53',
1      'cortex-a75.cortex-a55', 'native'.
1 
1      The values 'cortex-a57.cortex-a53', 'cortex-a72.cortex-a53',
1      'cortex-a73.cortex-a35', 'cortex-a73.cortex-a53',
1      'cortex-a75.cortex-a55' specify that GCC should tune for a
1      big.LITTLE system.
1 
1      The value 'neoverse-512tvb' specifies that GCC should tune for
1      Neoverse cores that (a) implement SVE and (b) have a total vector
1      bandwidth of 512 bits per cycle.  In other words, the option tells
1      GCC to tune for Neoverse cores that can execute four 128-bit Advanced
1      SIMD arithmetic instructions per cycle and that can execute an
1      equivalent number of SVE arithmetic instructions per cycle (2 for
1      256-bit SVE, 4 for 128-bit SVE). This is more general than tuning
1      for a specific core like Neoverse V1 but is more specific than the
1      default tuning described below.
1 
1      Additionally on native AArch64 GNU/Linux systems the value 'native'
1      tunes performance to the host system.  This option has no effect if
1      the compiler is unable to recognize the processor of the host
1      system.
1 
1      Where none of '-mtune=', '-mcpu=' or '-march=' are specified, the
1      code is tuned to perform well across a range of target processors.
1 
1      This option cannot be suffixed by feature modifiers.
1 
1 '-mcpu=NAME'
1      Specify the name of the target processor, optionally suffixed by
1      one or more feature modifiers.  This option has the form
1      '-mcpu=CPU{+[no]FEATURE}*', where the permissible values for CPU
1      are the same as those available for '-mtune'.  The permissible
1      values for FEATURE are documented in the sub-section on '-march'
1      and '-mcpu' Feature Modifiers.  Where conflicting feature modifiers
1      are specified, the right-most feature is used.
1 
1      GCC uses NAME to determine what kind of instructions it can emit
1      when generating assembly code (as if by '-march') and to determine
1      the target processor for which to tune for performance (as if by
1      '-mtune').  Where this option is used in conjunction with '-march'
1      or '-mtune', those options take precedence over the appropriate
1      part of this option.
1 
1      '-mcpu=neoverse-512tvb' is special in that it does not refer to a
1      specific core, but instead refers to all Neoverse cores that (a)
1      implement SVE and (b) have a total vector bandwidth of 512 bits a
1      cycle.  Unless overridden by '-march', '-mcpu=neoverse-512tvb'
1      generates code that can run on a Neoverse V1 core, since Neoverse
1      V1 is the first Neoverse core with these properties.  Unless
1      overridden by '-mtune', '-mcpu=neoverse-512tvb' tunes code in the
1      same way as for '-mtune=neoverse-512tvb'.
1 
1 '-moverride=STRING'
1      Override tuning decisions made by the back-end in response to a
1      '-mtune=' switch.  The syntax, semantics, and accepted values for
1      STRING in this option are not guaranteed to be consistent across
1      releases.
1 
1      This option is only intended to be useful when developing GCC.
1 
1 '-mverbose-cost-dump'
1      Enable verbose cost model dumping in the debug dump files.  This
1      option is provided for use in debugging the compiler.
1 
1 '-mpc-relative-literal-loads'
1 '-mno-pc-relative-literal-loads'
1      Enable or disable PC-relative literal loads.  With this option
1      literal pools are accessed using a single instruction and emitted
1      after each function.  This limits the maximum size of functions to
1      1MB. This is enabled by default for '-mcmodel=tiny'.
1 
1 '-msign-return-address=SCOPE'
1      Select the function scope on which return address signing will be
1      applied.  Permissible values are 'none', which disables return
1      address signing, 'non-leaf', which enables pointer signing for
1      functions which are not leaf functions, and 'all', which enables
1      pointer signing for all functions.  The default value is 'none'.
1 
1 '-msve-vector-bits=BITS'
1      Specify the number of bits in an SVE vector register.  This option
1      only has an effect when SVE is enabled.
1 
1      GCC supports two forms of SVE code generation: "vector-length
1      agnostic" output that works with any size of vector register and
1      "vector-length specific" output that allows GCC to make assumptions
1      about the vector length when it is useful for optimization reasons.
1      The possible values of BITS are: 'scalable', '128', '256', '512',
1      '1024' and '2048'.  Specifying 'scalable' selects vector-length
1      agnostic output.  At present '-msve-vector-bits=128' also generates
1      vector-length agnostic output.  All other values generate
1      vector-length specific code.  The behavior of these values may
1      change in future releases and no value except 'scalable' should be
1      relied on for producing code that is portable across different
1      hardware SVE vector lengths.
1 
1      The default is '-msve-vector-bits=scalable', which produces
1      vector-length agnostic code.
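
A simple loop of the following kind is a typical candidate for SVE code generation when SVE is enabled and vectorization is active.  This is only a sketch with a hypothetical function name; whether and how GCC vectorizes it depends on the optimization level and compiler version.

```c
#include <stddef.h>

/* Element-wise sum of two arrays.  The 'restrict' qualifiers tell
   the compiler that the arrays do not overlap. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

With '-msve-vector-bits=scalable' the generated code should make no assumption about the hardware vector length, whereas a fixed value such as '256' permits the compiler to assume that length.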
1 
1 3.18.1.1 '-march' and '-mcpu' Feature Modifiers
1 ...............................................
1 
1 Feature modifiers used with '-march' and '-mcpu' can be any of the
1 following and their inverses 'noFEATURE':
1 
1 'crc'
1      Enable CRC extension.  This is on by default for
1      '-march=armv8.1-a'.
1 'crypto'
1      Enable Crypto extension.  This also enables Advanced SIMD and
1      floating-point instructions.
1 'fp'
1      Enable floating-point instructions.  This is on by default for all
1      possible values for options '-march' and '-mcpu'.
1 'simd'
1      Enable Advanced SIMD instructions.  This also enables
1      floating-point instructions.  This is on by default for all
1      possible values for options '-march' and '-mcpu'.
1 'sve'
1      Enable Scalable Vector Extension instructions.  This also enables
1      Advanced SIMD and floating-point instructions.
1 'lse'
1      Enable Large System Extension instructions.  This is on by default
1      for '-march=armv8.1-a'.
1 'rdma'
1      Enable Round Double Multiply Accumulate instructions.  This is on
1      by default for '-march=armv8.1-a'.
1 'fp16'
1      Enable FP16 extension.  This also enables floating-point
1      instructions.
1 'fp16fml'
1      Enable FP16 fmla extension.  This also enables FP16 extensions and
1      floating-point instructions.  This option is enabled by default for
1      '-march=armv8.4-a'.  Use of this option with architectures prior to
1      Armv8.2-A is not supported.
1 
1 'rcpc'
1      Enable the RcPc extension.  This does not change code generation
1      from GCC, but is passed on to the assembler, enabling inline asm
1      statements to use instructions from the RcPc extension.
1 'dotprod'
1      Enable the Dot Product extension.  This also enables Advanced SIMD
1      instructions.
1 'aes'
1      Enable the Armv8-a aes and pmull crypto extension.  This also
1      enables Advanced SIMD instructions.
1 'sha2'
1      Enable the Armv8-a sha2 crypto extension.  This also enables
1      Advanced SIMD instructions.
1 'sha3'
1      Enable the sha512 and sha3 crypto extension.  This also enables
1      Advanced SIMD instructions.  Use of this option with architectures
1      prior to Armv8.2-A is not supported.
1 'sm4'
1      Enable the sm3 and sm4 crypto extension.  This also enables
1      Advanced SIMD instructions.  Use of this option with architectures
1      prior to Armv8.2-A is not supported.
1 
1  Feature 'crypto' implies 'aes', 'sha2', and 'simd', which implies 'fp'.
1 Conversely, 'nofp' implies 'nosimd', which implies 'nocrypto', 'noaes'
1 and 'nosha2'.
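
The implication rules above, together with the rule that the right-most modifier wins, can be modelled as a small bit-mask calculation.  The following is a toy model with hypothetical names, covering only the five features mentioned in this paragraph; it is not how GCC implements modifier handling.

```c
enum {
    FEAT_FP     = 1 << 0,
    FEAT_SIMD   = 1 << 1,
    FEAT_AES    = 1 << 2,
    FEAT_SHA2   = 1 << 3,
    FEAT_CRYPTO = 1 << 4
};

/* Everything that is switched on together with 'feature'. */
static unsigned implied_on(unsigned feature)
{
    switch (feature) {
    case FEAT_CRYPTO:
        return FEAT_CRYPTO | FEAT_AES | FEAT_SHA2 | FEAT_SIMD | FEAT_FP;
    case FEAT_AES:
    case FEAT_SHA2:
        return feature | FEAT_SIMD | FEAT_FP;
    case FEAT_SIMD:
        return FEAT_SIMD | FEAT_FP;
    default:
        return feature;
    }
}

/* Everything that is switched off together with 'noFEATURE'. */
static unsigned implied_off(unsigned feature)
{
    switch (feature) {
    case FEAT_FP:
        return FEAT_FP | FEAT_SIMD | FEAT_CRYPTO | FEAT_AES | FEAT_SHA2;
    case FEAT_SIMD:
        return FEAT_SIMD | FEAT_CRYPTO | FEAT_AES | FEAT_SHA2;
    default:
        return feature;
    }
}

/* Applies one modifier; call it once per modifier, left to right. */
unsigned apply_modifier(unsigned enabled, unsigned feature, int enable)
{
    return enable ? (enabled | implied_on(feature))
                  : (enabled & ~implied_off(feature));
}
```

In this model '+crypto+nofp' ends with every feature off, while '+nofp+crypto' ends with 'crypto', 'aes', 'sha2', 'simd' and 'fp' all on, because the later modifier overrides the earlier one.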
1