gcc: x86 Options

1 
1 3.18.56 x86 Options
1 -------------------
1 
1 These '-m' options are defined for the x86 family of computers.
1 
1 '-march=CPU-TYPE'
1      Generate instructions for the machine type CPU-TYPE.  In contrast
1      to '-mtune=CPU-TYPE', which merely tunes the generated code for the
1      specified CPU-TYPE, '-march=CPU-TYPE' allows GCC to generate code
1      that may not run at all on processors other than the one indicated.
1      Specifying '-march=CPU-TYPE' implies '-mtune=CPU-TYPE'.
1 
1      The choices for CPU-TYPE are:
1 
1      'native'
1           This selects the CPU to generate code for at compilation time
1           by determining the processor type of the compiling machine.
1           Using '-march=native' enables all instruction subsets
1           supported by the local machine (hence the result might not run
1           on different machines).  Using '-mtune=native' produces code
1           optimized for the local machine under the constraints of the
1           selected instruction set.
1 
1      'x86-64'
1           A generic CPU with 64-bit extensions.
1 
1      'i386'
1           Original Intel i386 CPU.
1 
1      'i486'
1           Intel i486 CPU.  (No scheduling is implemented for this chip.)
1 
1      'i586'
1      'pentium'
1           Intel Pentium CPU with no MMX support.
1 
1      'lakemont'
1           Intel Lakemont MCU, based on Intel Pentium CPU.
1 
1      'pentium-mmx'
1           Intel Pentium MMX CPU, based on Pentium core with MMX
1           instruction set support.
1 
1      'pentiumpro'
1           Intel Pentium Pro CPU.
1 
1      'i686'
1           When used with '-march', the Pentium Pro instruction set is
1           used, so the code runs on all i686 family chips.  When used
1           with '-mtune', it has the same meaning as 'generic'.
1 
1      'pentium2'
1           Intel Pentium II CPU, based on Pentium Pro core with MMX
1           instruction set support.
1 
1      'pentium3'
1      'pentium3m'
1           Intel Pentium III CPU, based on Pentium Pro core with MMX and
1           SSE instruction set support.
1 
1      'pentium-m'
1           Intel Pentium M; low-power version of Intel Pentium III CPU
1           with MMX, SSE and SSE2 instruction set support.  Used by
1           Centrino notebooks.
1 
1      'pentium4'
1      'pentium4m'
1           Intel Pentium 4 CPU with MMX, SSE and SSE2 instruction set
1           support.
1 
1      'prescott'
1           Improved version of Intel Pentium 4 CPU with MMX, SSE, SSE2
1           and SSE3 instruction set support.
1 
1      'nocona'
1           Improved version of Intel Pentium 4 CPU with 64-bit
1           extensions, MMX, SSE, SSE2 and SSE3 instruction set support.
1 
1      'core2'
1           Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3
1           and SSSE3 instruction set support.
1 
1      'nehalem'
1           Intel Nehalem CPU with 64-bit extensions, MMX, SSE, SSE2,
1           SSE3, SSSE3, SSE4.1, SSE4.2 and POPCNT instruction set
1           support.
1 
1      'westmere'
1           Intel Westmere CPU with 64-bit extensions, MMX, SSE, SSE2,
1           SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES and PCLMUL
1           instruction set support.
1 
1      'sandybridge'
1           Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2,
1           SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL
1           instruction set support.
1 
1      'ivybridge'
1           Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2,
1           SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL,
1           FSGSBASE, RDRND and F16C instruction set support.
1 
1      'haswell'
1           Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE,
1           SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES,
1           PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2 and F16C instruction
1           set support.
1 
1      'broadwell'
1           Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE,
1           SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES,
1           PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX
1           and PREFETCHW instruction set support.
1 
1      'skylake'
1           Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE,
1           SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES,
1           PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX,
1           PREFETCHW, CLFLUSHOPT, XSAVEC and XSAVES instruction set
1           support.
1 
1      'bonnell'
1           Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE,
1           SSE2, SSE3 and SSSE3 instruction set support.
1 
1      'silvermont'
1           Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE,
1           SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and
1           RDRND instruction set support.
1 
1      'knl'
1           Intel Knights Landing CPU with 64-bit extensions, MOVBE, MMX,
1           SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2,
1           AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED,
1           ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER and AVX512CD
1           instruction set support.
1 
1      'knm'
1           Intel Knights Mill CPU with 64-bit extensions, MOVBE, MMX,
1           SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2,
1           AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED,
1           ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER, AVX512CD,
1           AVX5124VNNIW, AVX5124FMAPS and AVX512VPOPCNTDQ instruction set
1           support.
1 
1      'skylake-avx512'
1           Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX,
1           SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1           AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1           RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1           CLWB, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction
1           set support.
1 
1      'cannonlake'
1           Intel Cannonlake Server CPU with 64-bit extensions, MOVBE,
1           MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1           AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1           RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1           AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
1           AVX512IFMA, SHA and UMIP instruction set support.
1 
1      'icelake-client'
1           Intel Icelake Client CPU with 64-bit extensions, MOVBE, MMX,
1           SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1           AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1           RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1           AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
1           AVX512IFMA, SHA, CLWB, UMIP, RDPID, GFNI, AVX512VBMI2,
1           AVX512VPOPCNTDQ, AVX512BITALG, AVX512VNNI, VPCLMULQDQ and VAES
1           instruction set support.
1 
1      'icelake-server'
1           Intel Icelake Server CPU with 64-bit extensions, MOVBE, MMX,
1           SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1           AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1           RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1           AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
1           AVX512IFMA, SHA, CLWB, UMIP, RDPID, GFNI, AVX512VBMI2,
1           AVX512VPOPCNTDQ, AVX512BITALG, AVX512VNNI, VPCLMULQDQ, VAES,
1           PCONFIG and WBNOINVD instruction set support.
1 
1      'k6'
1           AMD K6 CPU with MMX instruction set support.
1 
1      'k6-2'
1      'k6-3'
1           Improved versions of AMD K6 CPU with MMX and 3DNow!
1           instruction set support.
1 
1      'athlon'
1      'athlon-tbird'
1           AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and SSE
1           prefetch instructions support.
1 
1      'athlon-4'
1      'athlon-xp'
1      'athlon-mp'
1           Improved AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and
1           full SSE instruction set support.
1 
1      'k8'
1      'opteron'
1      'athlon64'
1      'athlon-fx'
1           Processors based on the AMD K8 core with x86-64 instruction
1           set support, including the AMD Opteron, Athlon 64, and Athlon
1           64 FX processors.  (This supersets MMX, SSE, SSE2, 3DNow!,
1           enhanced 3DNow! and 64-bit instruction set extensions.)
1 
1      'k8-sse3'
1      'opteron-sse3'
1      'athlon64-sse3'
1           Improved versions of AMD K8 cores with SSE3 instruction set
1           support.
1 
1      'amdfam10'
1      'barcelona'
1           CPUs based on AMD Family 10h cores with x86-64 instruction set
1           support.  (This supersets MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!,
1           enhanced 3DNow!, ABM and 64-bit instruction set extensions.)
1 
1      'bdver1'
1           CPUs based on AMD Family 15h cores with x86-64 instruction set
1           support.  (This supersets FMA4, AVX, XOP, LWP, AES, PCL_MUL,
1           CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM
1           and 64-bit instruction set extensions.)
1 
1      'bdver2'
1           AMD Family 15h core based CPUs with x86-64 instruction set
1           support.  (This supersets BMI, TBM, F16C, FMA, FMA4, AVX, XOP,
1           LWP, AES, PCL_MUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3,
1           SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.)
1 
1      'bdver3'
1           AMD Family 15h core based CPUs with x86-64 instruction set
1           support.  (This supersets BMI, TBM, F16C, FMA, FMA4, FSGSBASE,
1           AVX, XOP, LWP, AES, PCL_MUL, CX16, MMX, SSE, SSE2, SSE3,
1           SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set
1           extensions.)
1 
1      'bdver4'
1           AMD Family 15h core based CPUs with x86-64 instruction set
1           support.  (This supersets BMI, BMI2, TBM, F16C, FMA, FMA4,
1           FSGSBASE, AVX, AVX2, XOP, LWP, AES, PCL_MUL, CX16, MOVBE, MMX,
1           SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit
1           instruction set extensions.)
1 
1      'znver1'
1           AMD Family 17h core based CPUs with x86-64 instruction set
1           support.  (This supersets BMI, BMI2, F16C, FMA, FSGSBASE, AVX,
1           AVX2, ADCX, RDSEED, MWAITX, SHA, CLZERO, AES, PCL_MUL, CX16,
1           MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2,
1           ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, and 64-bit
1           instruction set extensions.)
1 
1      'btver1'
1           CPUs based on AMD Family 14h cores with x86-64 instruction set
1           support.  (This supersets MMX, SSE, SSE2, SSE3, SSSE3, SSE4A,
1           CX16, ABM and 64-bit instruction set extensions.)
1 
1      'btver2'
1           CPUs based on AMD Family 16h cores with x86-64 instruction set
1           support.  This includes MOVBE, F16C, BMI, AVX, PCL_MUL, AES,
1           SSE4.2, SSE4.1, CX16, ABM, SSE4A, SSSE3, SSE3, SSE2, SSE, MMX
1           and 64-bit instruction set extensions.
1 
1      'winchip-c6'
1           IDT WinChip C6 CPU, treated the same way as i486 with additional
1           MMX instruction set support.
1 
1      'winchip2'
1           IDT WinChip 2 CPU, treated the same way as i486 with additional
1           MMX and 3DNow! instruction set support.
1 
1      'c3'
1           VIA C3 CPU with MMX and 3DNow! instruction set support.  (No
1           scheduling is implemented for this chip.)
1 
1      'c3-2'
1           VIA C3-2 (Nehemiah/C5XL) CPU with MMX and SSE instruction set
1           support.  (No scheduling is implemented for this chip.)
1 
1      'c7'
1           VIA C7 (Esther) CPU with MMX, SSE, SSE2 and SSE3 instruction
1           set support.  (No scheduling is implemented for this chip.)
1 
1      'samuel-2'
1           VIA Eden Samuel 2 CPU with MMX and 3DNow! instruction set
1           support.  (No scheduling is implemented for this chip.)
1 
1      'nehemiah'
1           VIA Eden Nehemiah CPU with MMX and SSE instruction set
1           support.  (No scheduling is implemented for this chip.)
1 
1      'esther'
1           VIA Eden Esther CPU with MMX, SSE, SSE2 and SSE3 instruction
1           set support.  (No scheduling is implemented for this chip.)
1 
1      'eden-x2'
1           VIA Eden X2 CPU with x86-64, MMX, SSE, SSE2 and SSE3
1           instruction set support.  (No scheduling is implemented for
1           this chip.)
1 
1      'eden-x4'
1           VIA Eden X4 CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3,
1           SSE4.1, SSE4.2, AVX and AVX2 instruction set support.  (No
1           scheduling is implemented for this chip.)
1 
1      'nano'
1           Generic VIA Nano CPU with x86-64, MMX, SSE, SSE2, SSE3 and
1           SSSE3 instruction set support.  (No scheduling is implemented
1           for this chip.)
1 
1      'nano-1000'
1           VIA Nano 1xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3
1           instruction set support.  (No scheduling is implemented for
1           this chip.)
1 
1      'nano-2000'
1           VIA Nano 2xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3
1           instruction set support.  (No scheduling is implemented for
1           this chip.)
1 
1      'nano-3000'
1           VIA Nano 3xxx CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and
1           SSE4.1 instruction set support.  (No scheduling is implemented
1           for this chip.)
1 
1      'nano-x2'
1           VIA Nano Dual Core CPU with x86-64, MMX, SSE, SSE2, SSE3,
1           SSSE3 and SSE4.1 instruction set support.  (No scheduling is
1           implemented for this chip.)
1 
1      'nano-x4'
1           VIA Nano Quad Core CPU with x86-64, MMX, SSE, SSE2, SSE3,
1           SSSE3 and SSE4.1 instruction set support.  (No scheduling is
1           implemented for this chip.)
1 
1      'geode'
1           AMD Geode embedded processor with MMX and 3DNow! instruction
1           set support.
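
     The effect of '-march' is visible to C code through macros that GCC
     predefines for each enabled instruction set, such as '__SSE2__' and
     '__AVX2__'.  The following is a minimal sketch; the helper name
     'isa_level' is hypothetical and only a few of the macros are
     checked.

```c
/* Hypothetical helper: reports the widest SIMD level the compiler was
   allowed to use, based on macros GCC predefines per enabled ISA. */
static const char *isa_level(void)
{
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "baseline";
#endif
}
```

     The result is fixed at compile time, so it describes what the
     compiler was permitted to emit, not what the machine running the
     program supports.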
1 
1 '-mtune=CPU-TYPE'
1      Tune to CPU-TYPE everything applicable about the generated code,
1      except for the ABI and the set of available instructions.  While
1      picking a specific CPU-TYPE schedules things appropriately for that
1      particular chip, the compiler does not generate any code that
1      cannot run on the default machine type unless you use a
1      '-march=CPU-TYPE' option.  For example, if GCC is configured for
1      i686-pc-linux-gnu then '-mtune=pentium4' generates code that is
1      tuned for Pentium 4 but still runs on i686 machines.
1 
1      The choices for CPU-TYPE are the same as for '-march'.  In
1      addition, '-mtune' supports 2 extra choices for CPU-TYPE:
1 
1      'generic'
1           Produce code optimized for the most common IA32/AMD64/EM64T
1           processors.  If you know the CPU on which your code will run,
1           then you should use the corresponding '-mtune' or '-march'
1           option instead of '-mtune=generic'.  But, if you do not know
1           exactly what CPU users of your application will have, then you
1           should use this option.
1 
1           As new processors are deployed in the marketplace, the
1           behavior of this option will change.  Therefore, if you
1           upgrade to a newer version of GCC, code generation controlled
1           by this option will change to reflect the processors that are
1           most common at the time that version of GCC is released.
1 
1           There is no '-march=generic' option because '-march' indicates
1           the instruction set the compiler can use, and there is no
1           generic instruction set applicable to all processors.  In
1           contrast, '-mtune' indicates the processor (or, in this case,
1           collection of processors) for which the code is optimized.
1 
1      'intel'
1           Produce code optimized for the most current Intel processors,
1           which are Haswell and Silvermont for this version of GCC. If
1           you know the CPU on which your code will run, then you should
1           use the corresponding '-mtune' or '-march' option instead of
1           '-mtune=intel'.  But, if you want your application to perform
1           better on both Haswell and Silvermont, then you should use
1           this option.
1 
1           As new Intel processors are deployed in the marketplace, the
1           behavior of this option will change.  Therefore, if you
1           upgrade to a newer version of GCC, code generation controlled
1           by this option will change to reflect the most current Intel
1           processors at the time that version of GCC is released.
1 
1           There is no '-march=intel' option because '-march' indicates
1           the instruction set the compiler can use, and there is no
1           common instruction set applicable to all processors.  In
1           contrast, '-mtune' indicates the processor (or, in this case,
1           collection of processors) for which the code is optimized.
1 
1 '-mcpu=CPU-TYPE'
1      A deprecated synonym for '-mtune'.
1 
1 '-mfpmath=UNIT'
1      Generate floating-point arithmetic for selected unit UNIT.  The
1      choices for UNIT are:
1 
1      '387'
1           Use the standard 387 floating-point coprocessor present on the
1           majority of chips and emulated otherwise.  Code compiled with
1           this option runs almost everywhere.  The temporary results are
1           computed in 80-bit precision instead of the precision
1           specified by the type, resulting in slightly different results
1           compared to most other chips.  See '-ffloat-store' for a more
1           detailed description.
1 
1           This is the default choice for non-Darwin x86-32 targets.
1 
1      'sse'
1           Use scalar floating-point instructions present in the SSE
1           instruction set.  This instruction set is supported by Pentium
1           III and newer chips, and in the AMD line by Athlon-4, Athlon
1           XP and Athlon MP chips.  The earlier version of the SSE
1           instruction set supports only single-precision arithmetic,
1           thus the double and extended-precision arithmetic are still
1           done using 387.  A later version, present only in Pentium 4
1           and AMD x86-64 chips, supports double-precision arithmetic
1           too.
1 
1           For the x86-32 compiler, you must use '-march=CPU-TYPE',
1           '-msse' or '-msse2' switches to enable SSE extensions and make
1           this option effective.  For the x86-64 compiler, these
1           extensions are enabled by default.
1 
1           The resulting code should be considerably faster in the
1           majority of cases and avoid the numerical instability problems
1           of 387 code, but may break some existing code that expects
1           temporaries to be 80 bits.
1 
1           This is the default choice for the x86-64 compiler, Darwin
1           x86-32 targets, and the default choice for x86-32 targets with
1           the SSE2 instruction set when '-ffast-math' is enabled.
1 
1      'sse,387'
1      'sse+387'
1      'both'
1           Attempt to utilize both instruction sets at once.  This
1           effectively doubles the amount of available registers, and on
1           chips with separate execution units for 387 and SSE the
1           execution resources too.  Use this option with care, as it is
1           still experimental, because the GCC register allocator does
1           not model separate functional units well, resulting in
1           unstable performance.
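
     One way to observe which unit is in effect is the standard C macro
     'FLT_EVAL_METHOD' from '<float.h>'.  The sketch below assumes the
     usual GCC behavior on x86; the helper name 'eval_method' is
     hypothetical.

```c
#include <float.h>

/* Returns the evaluation method the compiler reports:
    0 = operations are evaluated in the precision of their type
        (what SSE arithmetic typically gives),
    2 = operations are evaluated in long double precision
        (what 387 arithmetic typically gives),
   -1 = indeterminate. */
static int eval_method(void)
{
    return FLT_EVAL_METHOD;
}
```

     Code that depends on 80-bit temporaries can check this value
     instead of assuming a particular '-mfpmath' setting.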
1 
1 '-masm=DIALECT'
1      Output assembly instructions using selected DIALECT.  Also affects
1      which dialect is used for basic 'asm' (⇒Basic Asm) and
1      extended 'asm' (⇒Extended Asm).  Supported choices (in
1      dialect order) are 'att' or 'intel'.  The default is 'att'.  Darwin
1      does not support 'intel'.
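
     Inline assembly that must build under either dialect can use GCC's
     '{att|intel}' alternative syntax inside the template.  This is a
     minimal sketch; the function name 'copy_int' is hypothetical, and
     the non-x86 branch exists only so the example compiles elsewhere.

```c
/* Copies src through a register-to-register move.  The braces select
   the text for the active dialect: "movl %1, %0" for 'att' and
   "mov %0, %1" for 'intel'. */
static int copy_int(int src)
{
#if defined(__i386__) || defined(__x86_64__)
    int dst;
    __asm__ ("mov{l}\t{%1, %0|%0, %1}" : "=r" (dst) : "r" (src));
    return dst;
#else
    return src;  /* fallback for non-x86 targets */
#endif
}
```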
1 
1 '-mieee-fp'
1 '-mno-ieee-fp'
1      Control whether or not the compiler uses IEEE floating-point
1      comparisons.  These correctly handle the case where the result of a
1      comparison is unordered.
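
     As an illustration of the unordered case: a comparison involving a
     NaN is neither less than, equal to, nor greater than.  The sketch
     below uses the standard 'isunordered' macro; the helper names are
     hypothetical.

```c
#include <math.h>

/* Returns 1 when a and b cannot be ordered, that is, when at least one
   of them is a NaN. */
static int is_unordered(double a, double b)
{
    return isunordered(a, b) ? 1 : 0;
}

/* Under IEEE rules this is 0 whenever either operand is a NaN. */
static int ieee_less(double a, double b)
{
    return a < b;
}
```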
1 
1 '-m80387'
1 '-mhard-float'
1      Generate output containing 80387 instructions for floating point.
1 
1 '-mno-80387'
1 '-msoft-float'
1      Generate output containing library calls for floating point.
1 
1      *Warning:* the requisite libraries are not part of GCC.  Normally
1      the facilities of the machine's usual C compiler are used, but this
1      cannot be done directly in cross-compilation.  You must make your
1      own arrangements to provide suitable library functions for
1      cross-compilation.
1 
1      On machines where a function returns floating-point results in the
1      80387 register stack, some floating-point opcodes may be emitted
1      even if '-msoft-float' is used.
1 
1 '-mno-fp-ret-in-387'
1      Do not use the FPU registers for return values of functions.
1 
1      The usual calling convention has functions return values of types
1      'float' and 'double' in an FPU register, even if there is no FPU.
1      The idea is that the operating system should emulate an FPU.
1 
1      The option '-mno-fp-ret-in-387' causes such values to be returned
1      in ordinary CPU registers instead.
1 
1 '-mno-fancy-math-387'
1      Some 387 emulators do not support the 'sin', 'cos' and 'sqrt'
1      instructions for the 387.  Specify this option to avoid generating
1      those instructions.  This option is the default on OpenBSD and
1      NetBSD.  This option is overridden when '-march' indicates that the
1      target CPU always has an FPU and so the instruction does not need
1      emulation.  These instructions are not generated unless you also
1      use the '-funsafe-math-optimizations' switch.
1 
1 '-malign-double'
1 '-mno-align-double'
1      Control whether GCC aligns 'double', 'long double', and 'long long'
1      variables on a two-word boundary or a one-word boundary.  Aligning
1      'double' variables on a two-word boundary produces code that runs
1      somewhat faster on a Pentium at the expense of more memory.
1 
1      On x86-64, '-malign-double' is enabled by default.
1 
1      *Warning:* if you use the '-malign-double' switch, structures
1      containing the above types are aligned differently than the
1      published application binary interface specifications for the
1      x86-32 and are not binary compatible with structures in code
1      compiled without that switch.
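
     The layout difference can be seen with 'offsetof'.  In this sketch
     the structure 'struct sample' and the helper 'value_offset' are
     hypothetical.

```c
#include <stddef.h>

struct sample {
    char tag;
    double value;   /* placement depends on the alignment of double */
};

/* Offset of the double member: expected to be 4 when 'double' is
   aligned on a one-word boundary and 8 when it is aligned on a
   two-word boundary. */
static size_t value_offset(void)
{
    return offsetof(struct sample, value);
}
```

     Two object files that disagree on this offset cannot safely share
     a 'struct sample'.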
1 
1 '-m96bit-long-double'
1 '-m128bit-long-double'
1      These switches control the size of 'long double' type.  The x86-32
1      application binary interface specifies the size to be 96 bits, so
1      '-m96bit-long-double' is the default in 32-bit mode.
1 
1      Modern architectures (Pentium and newer) prefer 'long double' to be
1      aligned to an 8- or 16-byte boundary.  In arrays or structures
1      conforming to the ABI, this is not possible.  So specifying
1      '-m128bit-long-double' aligns 'long double' to a 16-byte boundary
1      by padding the 'long double' with an additional 32-bit zero.
1 
1      In the x86-64 compiler, '-m128bit-long-double' is the default
1      choice as its ABI specifies that 'long double' is aligned on
1      16-byte boundary.
1 
1      Notice that neither of these options enables any extra precision
1      over the x87 standard of 80 bits for a 'long double'.
1 
1      *Warning:* if you override the default value for your target ABI,
1      this changes the size of structures and arrays containing 'long
1      double' variables, as well as modifying the function calling
1      convention for functions taking 'long double'.  Hence they are not
1      binary-compatible with code compiled without that switch.
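
     A short sketch of how to check the effect: the storage size of
     'long double' changes with these switches, while the significand
     width reported by 'LDBL_MANT_DIG' does not.  The helper names are
     hypothetical.

```c
#include <float.h>
#include <stddef.h>

/* Storage size in bytes: 12 corresponds to '-m96bit-long-double' and
   16 to '-m128bit-long-double' on x86. */
static size_t long_double_size(void)
{
    return sizeof(long double);
}

/* Significand width in bits; for the x87 80-bit format this is 64
   regardless of how much padding the type carries. */
static int long_double_mant_bits(void)
{
    return LDBL_MANT_DIG;
}
```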
1 
1 '-mlong-double-64'
1 '-mlong-double-80'
1 '-mlong-double-128'
1      These switches control the size of 'long double' type.  A size of
1      64 bits makes the 'long double' type equivalent to the 'double'
1      type.  This is the default for 32-bit Bionic C library.  A size of
1      128 bits makes the 'long double' type equivalent to the
1      '__float128' type.  This is the default for 64-bit Bionic C
1      library.
1 
1      *Warning:* if you override the default value for your target ABI,
1      this changes the size of structures and arrays containing 'long
1      double' variables, as well as modifying the function calling
1      convention for functions taking 'long double'.  Hence they are not
1      binary-compatible with code compiled without that switch.
1 
1 '-malign-data=TYPE'
1      Control how GCC aligns variables.  Supported values for TYPE are
1      'compat', which uses an increased alignment value compatible with
1      GCC 4.8 and earlier; 'abi', which uses the alignment value
1      specified by the psABI; and 'cacheline', which uses an increased
1      alignment value to match the cache line size.  'compat' is the
1      default.
1 
1 '-mlarge-data-threshold=THRESHOLD'
1      When '-mcmodel=medium' is specified, data objects larger than
1      THRESHOLD are placed in the large data section.  This value must be
1      the same across all objects linked into the binary, and defaults to
1      65535.
1 
1 '-mrtd'
1      Use a different function-calling convention, in which functions
1      that take a fixed number of arguments return with the 'ret NUM'
1      instruction, which pops their arguments while returning.  This
1      saves one instruction in the caller since there is no need to pop
1      the arguments there.
1 
1      You can specify that an individual function is called with this
1      calling sequence with the function attribute 'stdcall'.  You can
1      also override the '-mrtd' option by using the function attribute
1      'cdecl'.  ⇒Function Attributes.
1 
1      *Warning:* this calling convention is incompatible with the one
1      normally used on Unix, so you cannot use it if you need to call
1      libraries compiled with the Unix compiler.
1 
1      Also, you must provide function prototypes for all functions that
1      take variable numbers of arguments (including 'printf'); otherwise
1      incorrect code is generated for calls to those functions.
1 
1      In addition, seriously incorrect code results if you call a
1      function with too many arguments.  (Normally, extra arguments are
1      harmlessly ignored.)
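
     The per-function attributes can be sketched as follows.  The macro
     and function names are hypothetical, and the attributes are applied
     only when compiling for 32-bit x86, where they have an effect.

```c
#if defined(__i386__)
#define CALLEE_POPS __attribute__((stdcall))  /* callee pops arguments */
#define CALLER_POPS __attribute__((cdecl))    /* caller pops arguments */
#else
#define CALLEE_POPS
#define CALLER_POPS
#endif

/* Uses the 'ret NUM' convention even without '-mrtd'. */
static int CALLEE_POPS add_stdcall(int a, int b)
{
    return a + b;
}

/* Keeps the usual convention even when '-mrtd' is given. */
static int CALLER_POPS add_cdecl(int a, int b)
{
    return a + b;
}
```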
1 
1 '-mregparm=NUM'
1      Control how many registers are used to pass integer arguments.  By
1      default, no registers are used to pass arguments, and at most 3
1      registers can be used.  You can control this behavior for a
1      specific function by using the function attribute 'regparm'.
1      ⇒Function Attributes.
1 
1      *Warning:* if you use this switch, and NUM is nonzero, then you
1      must build all modules with the same value, including any
1      libraries.  This includes the system libraries and startup modules.
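
     For a single function, the attribute form can be sketched like
     this.  The macro 'REGPARM3' and the function 'sum3' are
     hypothetical; the attribute is applied only on 32-bit x86.

```c
#if defined(__i386__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3
#endif

/* With regparm(3), up to three integer arguments are passed in
   registers instead of on the stack. */
static int REGPARM3 sum3(int a, int b, int c)
{
    return a + b + c;
}
```

     Because the attribute is part of the function's type, every
     declaration visible to callers must carry it as well.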
1 
1 '-msseregparm'
1      Use SSE register passing conventions for float and double arguments
1      and return values.  You can control this behavior for a specific
1      function by using the function attribute 'sseregparm'.
1      ⇒Function Attributes.
1 
1      *Warning:* if you use this switch then you must build all modules
1      with the same value, including any libraries.  This includes the
1      system libraries and startup modules.
1 
1 '-mvect8-ret-in-mem'
1      Return 8-byte vectors in memory instead of MMX registers.  This is
1      the default on Solaris 8 and 9 and VxWorks to match the ABI of the
1      Sun Studio compilers until version 12.  Later compiler versions
1      (starting with Studio 12 Update 1) follow the ABI used by other x86
1      targets, which is the default on Solaris 10 and later.  _Only_ use
1      this option if you need to remain compatible with existing code
1      produced by those previous compiler versions or older versions of
1      GCC.
1 
1 '-mpc32'
1 '-mpc64'
1 '-mpc80'
1 
1      Set 80387 floating-point precision to 32, 64 or 80 bits.  When
1      '-mpc32' is specified, the significands of results of
1      floating-point operations are rounded to 24 bits (single
1      precision); '-mpc64' rounds the significands of results of
1      floating-point operations to 53 bits (double precision) and
1      '-mpc80' rounds the significands of results of floating-point
1      operations to 64 bits (extended double precision), which is the
1      default.  When this option is used, floating-point operations in
1      higher precisions are not available to the programmer without
1      setting the FPU control word explicitly.
1 
1      Setting the rounding of floating-point operations to less than the
1      default 80 bits can speed some programs by 2% or more.  Note that
1      some mathematical libraries assume that extended-precision (80-bit)
1      floating-point operations are enabled by default; routines in such
1      libraries could suffer significant loss of accuracy, typically
1      through so-called "catastrophic cancellation", when this option is
1      used to set the precision to less than extended precision.
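
     The precision setting lives in bits 8 and 9 of the x87 control
     word.  The sketch below decodes that field and, on x86, reads the
     live control word with 'fnstcw'.  The function names are
     hypothetical, and the non-x86 branch simply returns the x87 reset
     value so the example compiles elsewhere.

```c
/* Decode the precision-control field (bits 8-9) of an x87 control word
   into the significand width in bits; returns 0 for the reserved
   encoding. */
static int x87_significand_bits(unsigned short cw)
{
    switch ((cw >> 8) & 0x3) {
    case 0:  return 24;  /* single precision, as with '-mpc32' */
    case 2:  return 53;  /* double precision, as with '-mpc64' */
    case 3:  return 64;  /* extended precision, as with '-mpc80' */
    default: return 0;   /* reserved */
    }
}

/* Read the current control word on x86; elsewhere return 0x037F, the
   value the x87 has after initialization. */
static unsigned short read_x87_control_word(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned short cw;
    __asm__ __volatile__ ("fnstcw %0" : "=m" (cw));
    return cw;
#else
    return 0x037F;
#endif
}
```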
1 
1 '-mstackrealign'
1      Realign the stack at entry.  On the x86, the '-mstackrealign'
1      option generates an alternate prologue and epilogue that realigns
1      the run-time stack if necessary.  This supports mixing legacy code
1      that keeps 4-byte stack alignment with modern code that keeps
1      16-byte stack alignment for SSE compatibility.  See also the
1      attribute 'force_align_arg_pointer', applicable to individual
1      functions.
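
     A minimal sketch of the per-function attribute follows.  The macro
     'REALIGN_ENTRY' and the function 'local_is_16_aligned' are
     hypothetical; the attribute is applied only on x86 targets.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#define REALIGN_ENTRY __attribute__((force_align_arg_pointer))
#else
#define REALIGN_ENTRY
#endif

/* An entry point that callers with a weaker stack alignment might
   invoke.  Returns 1 when its 16-byte-aligned local really is
   16-byte aligned. */
REALIGN_ENTRY int local_is_16_aligned(void)
{
    _Alignas(16) volatile char buf[16];
    buf[0] = 0;
    return ((uintptr_t)buf % 16) == 0;
}
```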
1 
1 '-mpreferred-stack-boundary=NUM'
1      Attempt to keep the stack boundary aligned to a 2 raised to NUM
1      byte boundary.  If '-mpreferred-stack-boundary' is not specified,
1      the default is 4 (16 bytes or 128 bits).
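
     The relationship between NUM and the alignment in bytes is a power
     of two, which can be sketched as follows (the function name is
     hypothetical):

```c
/* Alignment in bytes requested by '-mpreferred-stack-boundary=NUM'. */
static unsigned stack_alignment_bytes(unsigned num)
{
    return 1u << num;
}
```

     So NUM values of 2, 3 and 4 correspond to 4-, 8- and 16-byte
     alignment respectively.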
1 
1      *Warning:* When generating code for the x86-64 architecture with
1      SSE extensions disabled, '-mpreferred-stack-boundary=3' can be used
1      to keep the stack aligned to an 8-byte boundary.  Because the
1      x86-64 ABI requires 16-byte stack alignment, this setting is ABI
1      incompatible and is intended only for controlled environments
1      where stack space is an important limitation.  It leads to wrong
1      code when functions compiled with 16-byte stack alignment (such as
1      functions from a standard library) are called with a misaligned
1      stack.  In this case, SSE instructions may cause misaligned memory
1      access traps.  In addition, variable arguments are handled
1      incorrectly for 16-byte-aligned objects (including x87 'long
1      double' and '__int128'), leading to wrong results.  You must build
1      all modules with '-mpreferred-stack-boundary=3', including any
1      libraries.  This includes the system libraries and startup
1      modules.
1 
1 '-mincoming-stack-boundary=NUM'
1      Assume the incoming stack is aligned to a 2 raised to NUM byte
1      boundary.  If '-mincoming-stack-boundary' is not specified, the one
1      specified by '-mpreferred-stack-boundary' is used.
1 
1      On Pentium and Pentium Pro, 'double' and 'long double' values
1      should be aligned to an 8-byte boundary (see '-malign-double') or
1      suffer significant run time performance penalties.  On Pentium III,
1      the Streaming SIMD Extension (SSE) data type '__m128' may not work
1      properly if it is not 16-byte aligned.
1 
1      To ensure proper alignment of these values on the stack, the stack
1      boundary must be as aligned as that required by any value stored on
1      the stack.  Further, every function must be generated such that it
1      keeps the stack aligned.  Thus calling a function compiled with a
1      higher preferred stack boundary from a function compiled with a
1      lower preferred stack boundary most likely misaligns the stack.  It
1      is recommended that libraries that use callbacks always use the
1      default setting.
1 
1      This extra alignment does consume extra stack space, and generally
1      increases code size.  Code that is sensitive to stack space usage,
1      such as embedded systems and operating system kernels, may want to
1      reduce the preferred alignment to '-mpreferred-stack-boundary=2'.
1 
1 '-mmmx'
1 '-msse'
1 '-msse2'
1 '-msse3'
1 '-mssse3'
1 '-msse4'
1 '-msse4a'
1 '-msse4.1'
1 '-msse4.2'
1 '-mavx'
1 '-mavx2'
1 '-mavx512f'
1 '-mavx512pf'
1 '-mavx512er'
1 '-mavx512cd'
1 '-mavx512vl'
1 '-mavx512bw'
1 '-mavx512dq'
1 '-mavx512ifma'
1 '-mavx512vbmi'
1 '-msha'
1 '-maes'
1 '-mpclmul'
1 '-mclflushopt'
1 '-mclwb'
1 '-mfsgsbase'
1 '-mrdrnd'
1 '-mf16c'
1 '-mfma'
1 '-mpconfig'
1 '-mwbnoinvd'
1 '-mfma4'
1 '-mprfchw'
1 '-mrdpid'
1 '-mprefetchwt1'
1 '-mrdseed'
1 '-msgx'
1 '-mxop'
1 '-mlwp'
1 '-m3dnow'
1 '-m3dnowa'
1 '-mpopcnt'
1 '-mabm'
1 '-madx'
1 '-mbmi'
1 '-mbmi2'
1 '-mlzcnt'
1 '-mfxsr'
1 '-mxsave'
1 '-mxsaveopt'
1 '-mxsavec'
1 '-mxsaves'
1 '-mrtm'
1 '-mhle'
1 '-mtbm'
1 '-mmpx'
1 '-mmwaitx'
1 '-mclzero'
1 '-mpku'
1 '-mavx512vbmi2'
1 '-mgfni'
1 '-mvaes'
1 '-mvpclmulqdq'
1 '-mavx512bitalg'
1 '-mmovdiri'
1 '-mmovdir64b'
1 '-mavx512vpopcntdq'
1 '-mavx5124fmaps'
1 '-mavx512vnni'
1 '-mavx5124vnniw'
1      These switches enable the use of instructions in the MMX, SSE,
1      SSE2, SSE3, SSSE3, SSE4, SSE4A, SSE4.1, SSE4.2, AVX, AVX2, AVX512F,
1      AVX512PF, AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ,
1      AVX512IFMA, AVX512VBMI, SHA, AES, PCLMUL, CLFLUSHOPT, CLWB,
1      FSGSBASE, RDRND, F16C, FMA, PCONFIG, WBNOINVD, FMA4, PREFETCHW,
1      RDPID, PREFETCHWT1, RDSEED, SGX, XOP, LWP, 3DNow!, enhanced 3DNow!,
1      POPCNT, ABM, ADX, BMI, BMI2, LZCNT, FXSR, XSAVE, XSAVEOPT, XSAVEC,
1      XSAVES, RTM, HLE, TBM, MPX, MWAITX, CLZERO, PKU, AVX512VBMI2, GFNI,
1      VAES, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B,
1      AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, or AVX5124VNNIW extended
1      instruction sets.  Each has a corresponding '-mno-' option to
1      disable use of these instructions.
1 
1      These extensions are also available as built-in functions: see
1      ⇒x86 Built-in Functions, for details of the functions
1      enabled and disabled by these switches.
1 
1      To generate SSE/SSE2 instructions automatically from floating-point
1      code (as opposed to 387 instructions), see '-mfpmath=sse'.
1 
1      GCC suppresses SSEx instructions when '-mavx' is used.  Instead, it
1      generates new AVX instructions or AVX equivalents for all SSEx
1      instructions when needed.
1 
1      These options enable GCC to use these extended instructions in
1      generated code, even without '-mfpmath=sse'.  Applications that
1      perform run-time CPU detection must compile separate files for each
1      supported architecture, using the appropriate flags.  In
1      particular, the file containing the CPU detection code should be
1      compiled without these options.
1 
1 '-mdump-tune-features'
1      This option instructs GCC to dump the names of the x86 performance
1      tuning features and default settings.  The names can be used in
1      '-mtune-ctrl=FEATURE-LIST'.
1 
1 '-mtune-ctrl=FEATURE-LIST'
1      This option is used for fine-grained control of x86 code
1      generation features.  FEATURE-LIST is a comma-separated list of
1      FEATURE names.  See also '-mdump-tune-features'.  A FEATURE is
1      turned on unless it is preceded by '^', in which case it is turned
1      off.  '-mtune-ctrl=FEATURE-LIST' is intended to be used by GCC
1      developers.  Using it may lead to code paths not covered by testing
1      and can potentially result in compiler ICEs or runtime errors.
1 
1 '-mno-default'
1      This option instructs GCC to turn off all tunable features.  See
1      also '-mtune-ctrl=FEATURE-LIST' and '-mdump-tune-features'.
1 
1 '-mcld'
1      This option instructs GCC to emit a 'cld' instruction in the
1      prologue of functions that use string instructions.  String
1      instructions depend on the DF flag to select between autoincrement
1      or autodecrement mode.  While the ABI specifies the DF flag to be
1      cleared on function entry, some operating systems violate this
1      specification by not clearing the DF flag in their exception
1      dispatchers.  The exception handler can be invoked with the DF flag
1      set, which leads to wrong direction mode when string instructions
1      are used.  This option can be enabled by default on 32-bit x86
1      targets by configuring GCC with the '--enable-cld' configure
1      option.  Generation of 'cld' instructions can be suppressed with
1      the '-mno-cld' compiler option in this case.
1 
1 '-mvzeroupper'
1      This option instructs GCC to emit a 'vzeroupper' instruction before
1      a transfer of control flow out of the function to minimize the AVX
1      to SSE transition penalty as well as remove unnecessary 'zeroupper'
1      intrinsics.
1 
1 '-mprefer-avx128'
1      This option instructs GCC to use 128-bit AVX instructions instead
1      of 256-bit AVX instructions in the auto-vectorizer.
1 
1 '-mprefer-vector-width=OPT'
1      This option instructs GCC to use OPT-bit vector width in
1      instructions instead of the default on the selected platform.
1 
1      'none'
1           No extra limitations are applied to GCC other than those
1           defined by the selected platform.
1 
1      '128'
1           Prefer 128-bit vector width for instructions.
1 
1      '256'
1           Prefer 256-bit vector width for instructions.
1 
1      '512'
1           Prefer 512-bit vector width for instructions.
1 
1 '-mcx16'
1      This option enables GCC to generate 'CMPXCHG16B' instructions in
1      64-bit code to implement compare-and-exchange operations on 16-byte
1      aligned 128-bit objects.  This is useful for atomic updates of data
1      structures exceeding one machine word in size.  The compiler uses
1      this instruction to implement ⇒__sync Builtins.  However,
1      for ⇒__atomic Builtins operating on 128-bit integers, a
1      library call is always used.
1 
1 '-msahf'
1      This option enables generation of 'SAHF' instructions in 64-bit
1      code.  Early Intel Pentium 4 CPUs with Intel 64 support, prior to
1      the introduction of Pentium 4 G1 step in December 2005, lacked the
1      'LAHF' and 'SAHF' instructions which are supported by AMD64.  These
1      are load and store instructions, respectively, for certain status
1      flags.  In 64-bit mode, the 'SAHF' instruction is used to optimize
1      'fmod', 'drem', and 'remainder' built-in functions; see ⇒Other
1      Builtins for details.
1 
1 '-mmovbe'
1      This option enables use of the 'movbe' instruction to implement
1      '__builtin_bswap32' and '__builtin_bswap64'.
1 
1 '-mshstk'
1      The '-mshstk' option enables shadow stack built-in functions from
1      x86 Control-flow Enforcement Technology (CET).
1 
1 '-mcrc32'
1      This option enables built-in functions '__builtin_ia32_crc32qi',
1      '__builtin_ia32_crc32hi', '__builtin_ia32_crc32si' and
1      '__builtin_ia32_crc32di' to generate the 'crc32' machine
1      instruction.
1 
1 '-mrecip'
1      This option enables use of 'RCPSS' and 'RSQRTSS' instructions (and
1      their vectorized variants 'RCPPS' and 'RSQRTPS') with an additional
1      Newton-Raphson step to increase precision instead of 'DIVSS' and
1      'SQRTSS' (and their vectorized variants) for single-precision
1      floating-point arguments.  These instructions are generated only
1      when '-funsafe-math-optimizations' is enabled together with
1      '-ffinite-math-only' and '-fno-trapping-math'.  Note that while the
1      throughput of the sequence is higher than the throughput of the
1      non-reciprocal instruction, the precision of the sequence can be
1      decreased by up to 2 ulp (i.e.  the inverse of 1.0 equals
1      0.99999994).
1 
1      Note that GCC implements '1.0f/sqrtf(X)' in terms of 'RSQRTSS' (or
1      'RSQRTPS') already with '-ffast-math' (or the above option
1      combination), and doesn't need '-mrecip'.
1 
1      Also note that GCC emits the above sequence with additional
1      Newton-Raphson step for vectorized single-float division and
1      vectorized 'sqrtf(X)' already with '-ffast-math' (or the above
1      option combination), and doesn't need '-mrecip'.
1 
1 '-mrecip=OPT'
1      This option controls which reciprocal estimate instructions may be
1      used.  OPT is a comma-separated list of options, which may be
1      preceded by a '!' to invert the option:
1 
1      'all'
1           Enable all estimate instructions.
1 
1      'default'
1           Enable the default instructions, equivalent to '-mrecip'.
1 
1      'none'
1           Disable all estimate instructions, equivalent to '-mno-recip'.
1 
1      'div'
1           Enable the approximation for scalar division.
1 
1      'vec-div'
1           Enable the approximation for vectorized division.
1 
1      'sqrt'
1           Enable the approximation for scalar square root.
1 
1      'vec-sqrt'
1           Enable the approximation for vectorized square root.
1 
1      So, for example, '-mrecip=all,!sqrt' enables all of the reciprocal
1      approximations, except for square root.
1 
1 '-mveclibabi=TYPE'
1      Specifies the ABI type to use for vectorizing intrinsics using an
1      external library.  Supported values for TYPE are 'svml' for the
1      Intel short vector math library and 'acml' for the AMD math core
1      library.  To use this option, both '-ftree-vectorize' and
1      '-funsafe-math-optimizations' have to be enabled, and an SVML or
1      ACML ABI-compatible library must be specified at link time.
1 
1      GCC currently emits calls to 'vmldExp2', 'vmldLn2', 'vmldLog102',
1      'vmldPow2', 'vmldTanh2', 'vmldTan2', 'vmldAtan2', 'vmldAtanh2',
1      'vmldCbrt2', 'vmldSinh2', 'vmldSin2', 'vmldAsinh2', 'vmldAsin2',
1      'vmldCosh2', 'vmldCos2', 'vmldAcosh2', 'vmldAcos2', 'vmlsExp4',
1      'vmlsLn4', 'vmlsLog104', 'vmlsPow4', 'vmlsTanh4', 'vmlsTan4',
1      'vmlsAtan4', 'vmlsAtanh4', 'vmlsCbrt4', 'vmlsSinh4', 'vmlsSin4',
1      'vmlsAsinh4', 'vmlsAsin4', 'vmlsCosh4', 'vmlsCos4', 'vmlsAcosh4'
1      and 'vmlsAcos4' for corresponding function type when
1      '-mveclibabi=svml' is used, and '__vrd2_sin', '__vrd2_cos',
1      '__vrd2_exp', '__vrd2_log', '__vrd2_log2', '__vrd2_log10',
1      '__vrs4_sinf', '__vrs4_cosf', '__vrs4_expf', '__vrs4_logf',
1      '__vrs4_log2f', '__vrs4_log10f' and '__vrs4_powf' for the
1      corresponding function type when '-mveclibabi=acml' is used.
1 
1 '-mabi=NAME'
1      Generate code for the specified calling convention.  Permissible
1      values are 'sysv' for the ABI used on GNU/Linux and other systems,
1      and 'ms' for the Microsoft ABI. The default is to use the Microsoft
1      ABI when targeting Microsoft Windows and the SysV ABI on all other
1      systems.  You can control this behavior for specific functions by
1      using the function attributes 'ms_abi' and 'sysv_abi'.
1      ⇒Function Attributes.
1 
1 '-mforce-indirect-call'
1      Force all calls to functions to be indirect.  This is useful when
1      using Intel Processor Trace where it generates more precise timing
1      information for function calls.
1 
1 '-mcall-ms2sysv-xlogues'
1      Due to differences in 64-bit ABIs, any Microsoft ABI function that
1      calls a System V ABI function must consider RSI, RDI and XMM6-15 as
1      clobbered.  By default, the code for saving and restoring these
1      registers is emitted inline, resulting in fairly lengthy prologues
1      and epilogues.  Using '-mcall-ms2sysv-xlogues' emits prologues and
1      epilogues that use stubs in the static portion of libgcc to perform
1      these saves and restores, thus reducing function size at the cost
1      of a few extra instructions.
1 
1 '-mtls-dialect=TYPE'
1      Generate code to access thread-local storage using the 'gnu' or
1      'gnu2' conventions.  'gnu' is the conservative default; 'gnu2' is
1      more efficient, but it may add compile- and run-time requirements
1      that cannot be satisfied on all systems.
1 
1 '-mpush-args'
1 '-mno-push-args'
1      Use PUSH operations to store outgoing parameters.  This method is
1      shorter and usually as fast as the method using SUB/MOV operations
1      and is enabled by default.  In some cases disabling it may improve
1      performance because of improved scheduling and reduced
1      dependencies.
1 
1 '-maccumulate-outgoing-args'
1      If enabled, the maximum amount of space required for outgoing
1      arguments is computed in the function prologue.  This is faster on
1      most modern CPUs because of reduced dependencies, improved
1      scheduling and reduced stack usage when the preferred stack
1      boundary is not equal to 2.  The drawback is a notable increase in
1      code size.  This switch implies '-mno-push-args'.
1 
1 '-mthreads'
1      Support thread-safe exception handling on MinGW. Programs that rely
1      on thread-safe exception handling must compile and link all code
1      with the '-mthreads' option.  When compiling, '-mthreads' defines
1      '-D_MT'; when linking, it links in a special thread helper library
1      '-lmingwthrd' which cleans up per-thread exception-handling data.
1 
1 '-mms-bitfields'
1 '-mno-ms-bitfields'
1 
1      Enable/disable bit-field layout compatible with the native
1      Microsoft Windows compiler.
1 
1      If 'packed' is used on a structure, or if bit-fields are used, it
1      may be that the Microsoft ABI lays out the structure differently
1      than the way GCC normally does.  Particularly when moving packed
1      data between functions compiled with GCC and the native Microsoft
1      compiler (either via function call or as data in a file), it may be
1      necessary to access either format.
1 
1      This option is enabled by default for Microsoft Windows targets.
1      This behavior can also be controlled locally by use of variable or
1      type attributes.  For more information, see ⇒x86 Variable
1      Attributes and ⇒x86 Type Attributes.
1 
1      The Microsoft structure layout algorithm is fairly simple with the
1      exception of the bit-field packing.  The padding and alignment of
1      members of structures and whether a bit-field can straddle a
1      storage-unit boundary are determined by these rules:
1 
1        1. Structure members are stored sequentially in the order in
1           which they are declared: the first member has the lowest
1           memory address and the last member the highest.
1 
1        2. Every data object has an alignment requirement.  The alignment
1           requirement for all data except structures, unions, and arrays
1           is either the size of the object or the current packing size
1           (specified with either the 'aligned' attribute or the 'pack'
1           pragma), whichever is less.  For structures, unions, and
1           arrays, the alignment requirement is the largest alignment
1           requirement of its members.  Every object is allocated an
1           offset so that:
1 
1                offset % alignment_requirement == 0
1 
1        3. Adjacent bit-fields are packed into the same 1-, 2-, or 4-byte
1           allocation unit if the integral types are the same size and if
1           the next bit-field fits into the current allocation unit
1           without crossing the boundary imposed by the common alignment
1           requirements of the bit-fields.
1 
1      MSVC interprets zero-length bit-fields in the following ways:
1 
1        1. If a zero-length bit-field is inserted between two bit-fields
1           that are normally coalesced, the bit-fields are not coalesced.
1 
1           For example:
1 
1                struct
1                 {
1                   unsigned long bf_1 : 12;
1                   unsigned long : 0;
1                   unsigned long bf_2 : 12;
1                 } t1;
1 
1           The size of 't1' is 8 bytes with the zero-length bit-field.
1           If the zero-length bit-field were removed, 't1''s size would
1           be 4 bytes.
1 
1        2. If a zero-length bit-field is inserted after a bit-field,
1           'foo', and the alignment of the zero-length bit-field is
1           greater than that of the member that follows it, 'bar', 'bar'
1           is aligned as the type of the zero-length bit-field.
1 
1           For example:
1 
1                struct
1                 {
1                   char foo : 4;
1                   short : 0;
1                   char bar;
1                 } t2;
1 
1                struct
1                 {
1                   char foo : 4;
1                   short : 0;
1                   double bar;
1                 } t3;
1 
1           For 't2', 'bar' is placed at offset 2, rather than offset 1.
1           Accordingly, the size of 't2' is 4.  For 't3', the zero-length
1           bit-field does not affect the alignment of 'bar' or, as a
1           result, the size of the structure.
1 
1           Taking this into account, it is important to note the
1           following:
1 
1             1. If a zero-length bit-field follows a normal bit-field,
1                the type of the zero-length bit-field may affect the
1                alignment of the structure as a whole.  For example, 't2'
1                has a size of 4 bytes, since the zero-length bit-field
1                follows a normal bit-field, and is of type short.
1 
1             2. Even if a zero-length bit-field is not followed by a
1                normal bit-field, it may still affect the alignment of
1                the structure:
1 
1                     struct
1                      {
1                        char foo : 6;
1                        long : 0;
1                      } t4;
1 
1                Here, 't4' takes up 4 bytes.
1 
1        3. Zero-length bit-fields following non-bit-field members are
1           ignored:
1 
1                struct
1                 {
1                   char foo;
1                   long : 0;
1                   char bar;
1                 } t5;
1 
1           Here, 't5' takes up 2 bytes.
1 
1 '-mno-align-stringops'
1      Do not align the destination of inlined string operations.  This
1      switch reduces code size and improves performance in case the
1      destination is already aligned, but GCC doesn't know about it.
1 
1 '-minline-all-stringops'
1      By default GCC inlines string operations only when the destination
1      is known to be aligned to at least a 4-byte boundary.  This option
1      enables more inlining and increases code size, but may improve
1      performance of code that depends on fast 'memcpy', 'strlen', and
1      'memset' for short lengths.
1 
1 '-minline-stringops-dynamically'
1      For string operations of unknown size, use run-time checks with
1      inline code for small blocks and a library call for large blocks.
1 
1 '-mstringop-strategy=ALG'
1      Override the internal decision heuristic for the particular
1      algorithm to use for inlining string operations.  The allowed
1      values for ALG are:
1 
1      'rep_byte'
1      'rep_4byte'
1      'rep_8byte'
1           Expand using i386 'rep' prefix of the specified size.
1 
1      'byte_loop'
1      'loop'
1      'unrolled_loop'
1           Expand into an inline loop.
1 
1      'libcall'
1           Always use a library call.
1 
1 '-mmemcpy-strategy=STRATEGY'
1      Override the internal decision heuristic to decide if
1      '__builtin_memcpy' should be inlined and what inline algorithm to
1      use when the expected size of the copy operation is known.
1      STRATEGY is a comma-separated list of ALG:MAX_SIZE:DEST_ALIGN
1      triplets.  ALG is one of the values accepted by
1      '-mstringop-strategy'; MAX_SIZE specifies the maximum byte size for
1      which the inline algorithm ALG is allowed.  For the last triplet,
1      MAX_SIZE must be '-1'.  The MAX_SIZE values of the triplets in the
1      list must be specified in increasing order.  The minimum byte size
1      for ALG is '0' for the first triplet and 'MAX_SIZE + 1' of the
1      preceding triplet for each later one.
1 
1 '-mmemset-strategy=STRATEGY'
1      This option is similar to '-mmemcpy-strategy=' except that it
1      controls '__builtin_memset' expansion.
1 
1 '-momit-leaf-frame-pointer'
1      Don't keep the frame pointer in a register for leaf functions.
1      This avoids the instructions to save, set up, and restore frame
1      pointers and makes an extra register available in leaf functions.
1      The option '-fomit-leaf-frame-pointer' removes the frame pointer
1      for leaf functions, which might make debugging harder.
1 
1 '-mtls-direct-seg-refs'
1 '-mno-tls-direct-seg-refs'
1      Controls whether TLS variables may be accessed with offsets from
1      the TLS segment register ('%gs' for 32-bit, '%fs' for 64-bit), or
1      whether the thread base pointer must be added.  Whether or not this
1      is valid depends on the operating system, and whether it maps the
1      segment to cover the entire TLS area.
1 
1      For systems that use the GNU C Library, the default is on.
1 
1 '-msse2avx'
1 '-mno-sse2avx'
1      Specify that the assembler should encode SSE instructions with VEX
1      prefix.  The option '-mavx' turns this on by default.
1 
1 '-mfentry'
1 '-mno-fentry'
1      If profiling is active ('-pg'), put the profiling counter call
1      before the prologue.  Note: On x86 architectures the attribute
1      'ms_hook_prologue' isn't possible at the moment for '-mfentry' and
1      '-pg'.
1 
1 '-mrecord-mcount'
1 '-mno-record-mcount'
1      If profiling is active ('-pg'), generate a __mcount_loc section
1      that contains pointers to each profiling call.  This is useful for
1      automatically patching calls in and out.
1 
1 '-mnop-mcount'
1 '-mno-nop-mcount'
1      If profiling is active ('-pg'), generate the calls to the profiling
1      functions as NOPs.  This is useful when they should be patched in
1      later dynamically.  This is likely only useful together with
1      '-mrecord-mcount'.
1 
1 '-mskip-rax-setup'
1 '-mno-skip-rax-setup'
1      When generating code for the x86-64 architecture with SSE
1      extensions disabled, '-mskip-rax-setup' can be used to skip setting
1      up the RAX register when there are no variable arguments passed in
1      vector registers.
1 
1      *Warning:* Since the RAX register is used to avoid unnecessarily
1      saving vector registers on the stack when passing variable
1      arguments, the impact of this option is that callees may waste some
1      stack space, misbehave or jump to a random location.  Callees
1      compiled with GCC 4.4 or newer do not have those issues, regardless
1      of the RAX register value.
1 
1 '-m8bit-idiv'
1 '-mno-8bit-idiv'
1      On some processors, like Intel Atom, 8-bit unsigned integer divide
1      is much faster than 32-bit/64-bit integer divide.  This option
1      generates a run-time check.  If both dividend and divisor are
1      within range of 0 to 255, 8-bit unsigned integer divide is used
1      instead of 32-bit/64-bit integer divide.
1 
1 '-mavx256-split-unaligned-load'
1 '-mavx256-split-unaligned-store'
1      Split 32-byte AVX unaligned load and store.
1 
1 '-mstack-protector-guard=GUARD'
1 '-mstack-protector-guard-reg=REG'
1 '-mstack-protector-guard-offset=OFFSET'
1      Generate stack protection code using the canary at GUARD.
1      Supported locations are 'global' for a global canary or 'tls' for a
1      per-thread canary in the TLS block (the default).  This option has
1      an effect only when '-fstack-protector' or '-fstack-protector-all'
1      is specified.
1 
1      With the latter choice the options
1      '-mstack-protector-guard-reg=REG' and
1      '-mstack-protector-guard-offset=OFFSET' furthermore specify which
1      segment register ('%fs' or '%gs') to use as base register for
1      reading the canary, and from what offset from that base register.
1      The default for those is as specified in the relevant ABI.
1 
1 '-mmitigate-rop'
1      Try to avoid generating code sequences that contain unintended
1      return opcodes, to mitigate against certain forms of attack.  At
1      the moment, this option is limited in what it can do and should not
1      be relied on to provide serious protection.
1 
1 '-mgeneral-regs-only'
1      Generate code that uses only the general-purpose registers.  This
1      prevents the compiler from using floating-point, vector, mask and
1      bound registers.
1 
1 '-mindirect-branch=CHOICE'
1      Convert indirect call and jump with CHOICE.  The default is 'keep',
1      which keeps indirect call and jump unmodified.  'thunk' converts
1      indirect call and jump to call and return thunk.  'thunk-inline'
1      converts indirect call and jump to inlined call and return thunk.
1      'thunk-extern' converts indirect call and jump to external call and
1      return thunk provided in a separate object file.  You can control
1      this behavior for a specific function by using the function
1      attribute 'indirect_branch'.  ⇒Function Attributes.
1 
1      Note that '-mcmodel=large' is incompatible with
1      '-mindirect-branch=thunk' and '-mindirect-branch=thunk-extern'
1      since the thunk function may not be reachable in the large code
1      model.
1 
1      Note that '-mindirect-branch=thunk-extern' is incompatible with
1      '-fcf-protection=branch' and '-fcheck-pointer-bounds' since the
1      external thunk cannot be modified to disable the control-flow check.
1 
1 '-mfunction-return=CHOICE'
1      Convert function return with CHOICE.  The default is 'keep', which
1      keeps function return unmodified.  'thunk' converts function return
1      to call and return thunk.  'thunk-inline' converts function return
1      to inlined call and return thunk.  'thunk-extern' converts function
1      return to external call and return thunk provided in a separate
1      object file.  You can control this behavior for a specific function
1      by using the function attribute 'function_return'.  ⇒Function
1      Attributes.
1 
1      Note that '-mcmodel=large' is incompatible with
1      '-mfunction-return=thunk' and '-mfunction-return=thunk-extern'
1      since the thunk function may not be reachable in the large code
1      model.
1 
1 '-mindirect-branch-register'
1      Force indirect call and jump via register.
1 
1 '-mharden-sls=CHOICE'
1      Generate code to mitigate against straight line speculation (SLS)
1      with CHOICE.  The default is 'none' which disables all SLS
1      hardening.  'return' enables SLS hardening for function returns.
1      'indirect-jmp' enables SLS hardening for indirect jumps.  'all'
1      enables all SLS hardening.
1 
1 '-mindirect-branch-cs-prefix'
1      Add CS prefix to call and jmp to indirect thunk with branch target
1      in r8-r15 registers so that the call and jmp instruction length is
1      6 bytes to allow them to be replaced with 'lfence; call *%r8-r15'
1      or 'lfence; jmp *%r8-r15' at run-time.
1 
1  These '-m' switches are supported in addition to the above on x86-64
1 processors in 64-bit environments.
1 
1 '-m32'
1 '-m64'
1 '-mx32'
1 '-m16'
1 '-miamcu'
1      Generate code for a 16-bit, 32-bit or 64-bit environment.  The
1      '-m32' option sets 'int', 'long', and pointer types to 32 bits, and
1      generates code that runs on any i386 system.
1 
1      The '-m64' option sets 'int' to 32 bits and 'long' and pointer
1      types to 64 bits, and generates code for the x86-64 architecture.
1      For Darwin only the '-m64' option also turns off the '-fno-pic' and
1      '-mdynamic-no-pic' options.
1 
1      The '-mx32' option sets 'int', 'long', and pointer types to 32
1      bits, and generates code for the x86-64 architecture.
1 
1      The '-m16' option is the same as '-m32', except that it outputs
1      the '.code16gcc' assembly directive at the beginning of the
1      assembly output so that the binary can run in 16-bit mode.
1 
1      The '-miamcu' option generates code which conforms to Intel MCU
1      psABI. It requires the '-m32' option to be turned on.
1 
1 '-mno-red-zone'
1      Do not use a so-called "red zone" for x86-64 code.  The red zone is
1      mandated by the x86-64 ABI; it is a 128-byte area beyond the
1      location of the stack pointer that is not modified by signal or
1      interrupt handlers and therefore can be used for temporary data
1      without adjusting the stack pointer.  The flag '-mno-red-zone'
1      disables this red zone.
1 
1 '-mcmodel=small'
1      Generate code for the small code model: the program and its symbols
1      must be linked in the lower 2 GB of the address space.  Pointers
1      are 64 bits.  Programs can be statically or dynamically linked.
1      This is the default code model.
1 
1 '-mcmodel=kernel'
1      Generate code for the kernel code model.  The kernel runs in the
1      negative 2 GB of the address space.  This model has to be used for
1      Linux kernel code.
1 
1 '-mcmodel=medium'
1      Generate code for the medium model: the program is linked in the
1      lower 2 GB of the address space.  Small symbols are also placed
1      there.  Symbols with sizes larger than '-mlarge-data-threshold' are
1      put into large data or BSS sections and can be located above 2GB.
1      Programs can be statically or dynamically linked.
1 
1 '-mcmodel=large'
1      Generate code for the large model.  This model makes no assumptions
1      about addresses and sizes of sections.
1 
1 '-maddress-mode=long'
1      Generate code for long address mode.  This is only supported for
1      64-bit and x32 environments.  It is the default address mode for
1      64-bit environments.
1 
1 '-maddress-mode=short'
1      Generate code for short address mode.  This is only supported for
1      32-bit and x32 environments.  It is the default address mode for
1      32-bit and x32 environments.
1