gcc: x86 Options
1
1 3.18.56 x86 Options
1 -------------------
1
1 These '-m' options are defined for the x86 family of computers.
1
1 '-march=CPU-TYPE'
1 Generate instructions for the machine type CPU-TYPE. In contrast
1 to '-mtune=CPU-TYPE', which merely tunes the generated code for the
1 specified CPU-TYPE, '-march=CPU-TYPE' allows GCC to generate code
1 that may not run at all on processors other than the one indicated.
1 Specifying '-march=CPU-TYPE' implies '-mtune=CPU-TYPE'.
1
1 The choices for CPU-TYPE are:
1
1 'native'
1 This selects the CPU to generate code for at compilation time
1 by determining the processor type of the compiling machine.
1 Using '-march=native' enables all instruction subsets
1 supported by the local machine (hence the result might not run
1 on different machines). Using '-mtune=native' produces code
1 optimized for the local machine under the constraints of the
1 selected instruction set.
1
1 'x86-64'
1 A generic CPU with 64-bit extensions.
1
1 'i386'
1 Original Intel i386 CPU.
1
1 'i486'
1 Intel i486 CPU. (No scheduling is implemented for this chip.)
1
1 'i586'
1 'pentium'
1 Intel Pentium CPU with no MMX support.
1
1 'lakemont'
1 Intel Lakemont MCU, based on Intel Pentium CPU.
1
1 'pentium-mmx'
1 Intel Pentium MMX CPU, based on Pentium core with MMX
1 instruction set support.
1
1 'pentiumpro'
1 Intel Pentium Pro CPU.
1
1 'i686'
1 When used with '-march', the Pentium Pro instruction set is
1 used, so the code runs on all i686 family chips. When used
1 with '-mtune', it has the same meaning as 'generic'.
1
1 'pentium2'
1 Intel Pentium II CPU, based on Pentium Pro core with MMX
1 instruction set support.
1
1 'pentium3'
1 'pentium3m'
1 Intel Pentium III CPU, based on Pentium Pro core with MMX and
1 SSE instruction set support.
1
1 'pentium-m'
1 Intel Pentium M; low-power version of Intel Pentium III CPU
1 with MMX, SSE and SSE2 instruction set support. Used by
1 Centrino notebooks.
1
1 'pentium4'
1 'pentium4m'
1 Intel Pentium 4 CPU with MMX, SSE and SSE2 instruction set
1 support.
1
1 'prescott'
1 Improved version of Intel Pentium 4 CPU with MMX, SSE, SSE2
1 and SSE3 instruction set support.
1
1 'nocona'
1 Improved version of Intel Pentium 4 CPU with 64-bit
1 extensions, MMX, SSE, SSE2 and SSE3 instruction set support.
1
1 'core2'
1 Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3
1 and SSSE3 instruction set support.
1
1 'nehalem'
1 Intel Nehalem CPU with 64-bit extensions, MMX, SSE, SSE2,
1 SSE3, SSSE3, SSE4.1, SSE4.2 and POPCNT instruction set
1 support.
1
1 'westmere'
1 Intel Westmere CPU with 64-bit extensions, MMX, SSE, SSE2,
1 SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES and PCLMUL
1 instruction set support.
1
1 'sandybridge'
1 Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2,
1 SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL
1 instruction set support.
1
1 'ivybridge'
1 Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2,
1 SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL,
1 FSGSBASE, RDRND and F16C instruction set support.
1
1 'haswell'
1 Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE,
1 SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES,
1 PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2 and F16C instruction
1 set support.
1
1 'broadwell'
1 Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE,
1 SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES,
1 PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX
1 and PREFETCHW instruction set support.
1
1 'skylake'
1 Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE,
1 SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES,
1 PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX,
1 PREFETCHW, CLFLUSHOPT, XSAVEC and XSAVES instruction set
1 support.
1
1 'bonnell'
1 Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE,
1 SSE2, SSE3 and SSSE3 instruction set support.
1
1 'silvermont'
1 Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE,
1 SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and
1 RDRND instruction set support.
1
1 'knl'
1 Intel Knights Landing CPU with 64-bit extensions, MOVBE, MMX,
1 SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2,
1 AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED,
1 ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER and AVX512CD
1 instruction set support.
1
1 'knm'
1 Intel Knights Mill CPU with 64-bit extensions, MOVBE, MMX,
1 SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2,
1 AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C, RDSEED,
1 ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER, AVX512CD,
1 AVX5124VNNIW, AVX5124FMAPS and AVX512VPOPCNTDQ instruction set
1 support.
1
1 'skylake-avx512'
1 Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX,
1 SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1 AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1 RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1 CLWB, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction
1 set support.
1
1 'cannonlake'
1 Intel Cannonlake Server CPU with 64-bit extensions, MOVBE,
1 MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1 AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1 RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1 AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
1 AVX512IFMA, SHA and UMIP instruction set support.
1
1 'icelake-client'
1 Intel Icelake Client CPU with 64-bit extensions, MOVBE, MMX,
1 SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1 AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1 RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1 AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
1 AVX512IFMA, SHA, CLWB, UMIP, RDPID, GFNI, AVX512VBMI2,
1 AVX512VPOPCNTDQ, AVX512BITALG, AVX512VNNI, VPCLMULQDQ and VAES
1 instruction set support.
1
1 'icelake-server'
1 Intel Icelake Server CPU with 64-bit extensions, MOVBE, MMX,
1 SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX,
1 AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2, F16C,
1 RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
1 AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
1 AVX512IFMA, SHA, CLWB, UMIP, RDPID, GFNI, AVX512VBMI2,
1 AVX512VPOPCNTDQ, AVX512BITALG, AVX512VNNI, VPCLMULQDQ, VAES,
1 PCONFIG and WBNOINVD instruction set support.
1
1 'k6'
1 AMD K6 CPU with MMX instruction set support.
1
1 'k6-2'
1 'k6-3'
1 Improved versions of AMD K6 CPU with MMX and 3DNow!
1 instruction set support.
1
1 'athlon'
1 'athlon-tbird'
1 AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and SSE
1 prefetch instructions support.
1
1 'athlon-4'
1 'athlon-xp'
1 'athlon-mp'
1 Improved AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and
1 full SSE instruction set support.
1
1 'k8'
1 'opteron'
1 'athlon64'
1 'athlon-fx'
1 Processors based on the AMD K8 core with x86-64 instruction
1 set support, including the AMD Opteron, Athlon 64, and Athlon
1 64 FX processors. (This supersets MMX, SSE, SSE2, 3DNow!,
1 enhanced 3DNow! and 64-bit instruction set extensions.)
1
1 'k8-sse3'
1 'opteron-sse3'
1 'athlon64-sse3'
1 Improved versions of AMD K8 cores with SSE3 instruction set
1 support.
1
1 'amdfam10'
1 'barcelona'
1 CPUs based on AMD Family 10h cores with x86-64 instruction set
1 support. (This supersets MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!,
1 enhanced 3DNow!, ABM and 64-bit instruction set extensions.)
1
1 'bdver1'
1 CPUs based on AMD Family 15h cores with x86-64 instruction set
1 support. (This supersets FMA4, AVX, XOP, LWP, AES, PCL_MUL,
1 CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM
1 and 64-bit instruction set extensions.)
1
1 'bdver2'
1 AMD Family 15h core based CPUs with x86-64 instruction set
1 support. (This supersets BMI, TBM, F16C, FMA, FMA4, AVX, XOP,
1 LWP, AES, PCL_MUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3,
1 SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.)
1
1 'bdver3'
1 AMD Family 15h core based CPUs with x86-64 instruction set
1 support. (This supersets BMI, TBM, F16C, FMA, FMA4, FSGSBASE,
1 AVX, XOP, LWP, AES, PCL_MUL, CX16, MMX, SSE, SSE2, SSE3,
1 SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set
1 extensions.)
1
1 'bdver4'
1 AMD Family 15h core based CPUs with x86-64 instruction set
1 support. (This supersets BMI, BMI2, TBM, F16C, FMA, FMA4,
1 FSGSBASE, AVX, AVX2, XOP, LWP, AES, PCL_MUL, CX16, MOVBE, MMX,
1 SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit
1 instruction set extensions.)
1
1 'znver1'
1 AMD Family 17h core based CPUs with x86-64 instruction set
1 support. (This supersets BMI, BMI2, F16C, FMA, FSGSBASE, AVX,
1 AVX2, ADCX, RDSEED, MWAITX, SHA, CLZERO, AES, PCL_MUL, CX16,
1 MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2,
1 ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, and 64-bit
1 instruction set extensions.)
1
1 'btver1'
1 CPUs based on AMD Family 14h cores with x86-64 instruction set
1 support. (This supersets MMX, SSE, SSE2, SSE3, SSSE3, SSE4A,
1 CX16, ABM and 64-bit instruction set extensions.)
1
1 'btver2'
1 CPUs based on AMD Family 16h cores with x86-64 instruction set
1 support. This includes MOVBE, F16C, BMI, AVX, PCL_MUL, AES,
1 SSE4.2, SSE4.1, CX16, ABM, SSE4A, SSSE3, SSE3, SSE2, SSE, MMX
1 and 64-bit instruction set extensions.
1
1 'winchip-c6'
1 IDT WinChip C6 CPU, treated the same way as i486 with additional
1 MMX instruction set support.
1
1 'winchip2'
1 IDT WinChip 2 CPU, treated the same way as i486 with additional
1 MMX and 3DNow! instruction set support.
1
1 'c3'
1 VIA C3 CPU with MMX and 3DNow! instruction set support. (No
1 scheduling is implemented for this chip.)
1
1 'c3-2'
1 VIA C3-2 (Nehemiah/C5XL) CPU with MMX and SSE instruction set
1 support. (No scheduling is implemented for this chip.)
1
1 'c7'
1 VIA C7 (Esther) CPU with MMX, SSE, SSE2 and SSE3 instruction
1 set support. (No scheduling is implemented for this chip.)
1
1 'samuel-2'
1 VIA Eden Samuel 2 CPU with MMX and 3DNow! instruction set
1 support. (No scheduling is implemented for this chip.)
1
1 'nehemiah'
1 VIA Eden Nehemiah CPU with MMX and SSE instruction set
1 support. (No scheduling is implemented for this chip.)
1
1 'esther'
1 VIA Eden Esther CPU with MMX, SSE, SSE2 and SSE3 instruction
1 set support. (No scheduling is implemented for this chip.)
1
1 'eden-x2'
1 VIA Eden X2 CPU with x86-64, MMX, SSE, SSE2 and SSE3
1 instruction set support. (No scheduling is implemented for
1 this chip.)
1
1 'eden-x4'
1 VIA Eden X4 CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3,
1 SSE4.1, SSE4.2, AVX and AVX2 instruction set support. (No
1 scheduling is implemented for this chip.)
1
1 'nano'
1 Generic VIA Nano CPU with x86-64, MMX, SSE, SSE2, SSE3 and
1 SSSE3 instruction set support. (No scheduling is implemented
1 for this chip.)
1
1 'nano-1000'
1 VIA Nano 1xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3
1 instruction set support. (No scheduling is implemented for
1 this chip.)
1
1 'nano-2000'
1 VIA Nano 2xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3
1 instruction set support. (No scheduling is implemented for
1 this chip.)
1
1 'nano-3000'
1 VIA Nano 3xxx CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and
1 SSE4.1 instruction set support. (No scheduling is implemented
1 for this chip.)
1
1 'nano-x2'
1 VIA Nano Dual Core CPU with x86-64, MMX, SSE, SSE2, SSE3,
1 SSSE3 and SSE4.1 instruction set support. (No scheduling is
1 implemented for this chip.)
1
1 'nano-x4'
1 VIA Nano Quad Core CPU with x86-64, MMX, SSE, SSE2, SSE3,
1 SSSE3 and SSE4.1 instruction set support. (No scheduling is
1 implemented for this chip.)
1
1 'geode'
1 AMD Geode embedded processor with MMX and 3DNow! instruction
1 set support.
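
One way to see what a given '-march' value enables is to test the feature macros that GCC predefines for the translation unit. The sketch below is only an illustration: the macro names ('__SSE2__', '__AVX__', '__AVX2__', '__AVX512F__') are not listed in this section, and the function name is hypothetical.

```c
/* Hypothetical helper: reports the widest SIMD extension the compiler
   was allowed to use when this file was compiled.  The macros are
   predefined by GCC according to -march and the -m<isa> switches. */
const char *compiled_simd_level(void)
{
#if defined(__AVX512F__)
    return "AVX512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline";
#endif
}
```

Built with '-march=haswell' this would be expected to return "AVX2", and with '-march=x86-64' it would be expected to return "SSE2". '-mtune' alone should not change the result, since it does not alter the set of available instructions.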
1
1 '-mtune=CPU-TYPE'
1 Tune to CPU-TYPE everything applicable about the generated code,
1 except for the ABI and the set of available instructions. While
1 picking a specific CPU-TYPE schedules things appropriately for that
1 particular chip, the compiler does not generate any code that
1 cannot run on the default machine type unless you use a
1 '-march=CPU-TYPE' option. For example, if GCC is configured for
1 i686-pc-linux-gnu then '-mtune=pentium4' generates code that is
1 tuned for Pentium 4 but still runs on i686 machines.
1
1 The choices for CPU-TYPE are the same as for '-march'. In
1 addition, '-mtune' supports two extra choices for CPU-TYPE:
1
1 'generic'
1 Produce code optimized for the most common IA32/AMD64/EM64T
1 processors. If you know the CPU on which your code will run,
1 then you should use the corresponding '-mtune' or '-march'
1 option instead of '-mtune=generic'. But, if you do not know
1 exactly what CPU users of your application will have, then you
1 should use this option.
1
1 As new processors are deployed in the marketplace, the
1 behavior of this option will change. Therefore, if you
1 upgrade to a newer version of GCC, code generation controlled
1 by this option will change to reflect the processors that are
1 most common at the time that version of GCC is released.
1
1 There is no '-march=generic' option because '-march' indicates
1 the instruction set the compiler can use, and there is no
1 generic instruction set applicable to all processors. In
1 contrast, '-mtune' indicates the processor (or, in this case,
1 collection of processors) for which the code is optimized.
1
1 'intel'
1 Produce code optimized for the most current Intel processors,
1 which are Haswell and Silvermont for this version of GCC. If
1 you know the CPU on which your code will run, then you should
1 use the corresponding '-mtune' or '-march' option instead of
1 '-mtune=intel'. But, if you want your application to perform
1 better on both Haswell and Silvermont, then you should use
1 this option.
1
1 As new Intel processors are deployed in the marketplace, the
1 behavior of this option will change. Therefore, if you
1 upgrade to a newer version of GCC, code generation controlled
1 by this option will change to reflect the most current Intel
1 processors at the time that version of GCC is released.
1
1 There is no '-march=intel' option because '-march' indicates
1 the instruction set the compiler can use, and there is no
1 common instruction set applicable to all processors. In
1 contrast, '-mtune' indicates the processor (or, in this case,
1 collection of processors) for which the code is optimized.
1
1 '-mcpu=CPU-TYPE'
1 A deprecated synonym for '-mtune'.
1
1 '-mfpmath=UNIT'
1 Generate floating-point arithmetic for selected unit UNIT. The
1 choices for UNIT are:
1
1 '387'
1 Use the standard 387 floating-point coprocessor present on the
1 majority of chips and emulated otherwise. Code compiled with
1 this option runs almost everywhere. The temporary results are
1 computed in 80-bit precision instead of the precision
1 specified by the type, resulting in slightly different results
1 compared to most other chips. See '-ffloat-store' for a more
1 detailed description.
1
1 This is the default choice for non-Darwin x86-32 targets.
1
1 'sse'
1 Use scalar floating-point instructions present in the SSE
1 instruction set. This instruction set is supported by Pentium
1 III and newer chips, and in the AMD line by Athlon-4, Athlon
1 XP and Athlon MP chips. The earlier version of the SSE
1 instruction set supports only single-precision arithmetic,
1 thus the double and extended-precision arithmetic are still
1 done using 387. A later version, present only in Pentium 4
1 and AMD x86-64 chips, supports double-precision arithmetic
1 too.
1
1 For the x86-32 compiler, you must use '-march=CPU-TYPE',
1 '-msse' or '-msse2' switches to enable SSE extensions and make
1 this option effective. For the x86-64 compiler, these
1 extensions are enabled by default.
1
1 The resulting code should be considerably faster in the
1 majority of cases and avoid the numerical instability problems
1 of 387 code, but may break some existing code that expects
1 temporaries to be 80 bits.
1
1 This is the default choice for the x86-64 compiler, Darwin
1 x86-32 targets, and the default choice for x86-32 targets with
1 the SSE2 instruction set when '-ffast-math' is enabled.
1
1 'sse,387'
1 'sse+387'
1 'both'
1 Attempt to utilize both instruction sets at once. This
1 effectively doubles the amount of available registers, and on
1 chips with separate execution units for 387 and SSE the
1 execution resources too. Use this option with care, as it is
1 still experimental, because the GCC register allocator does
1 not model separate functional units well, resulting in
1 unstable performance.
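
The effect of the chosen unit on intermediate precision can be observed through the standard C99 macro 'FLT_EVAL_METHOD'. This is a minimal sketch; the function name is hypothetical and the macro comes from '<float.h>', not from this section.

```c
#include <float.h>

/* Returns FLT_EVAL_METHOD as seen by this translation unit:
   0  = operations are evaluated in the precision of their type,
   2  = operations are evaluated in long double precision,
   -1 = the evaluation method is indeterminate. */
int float_eval_method(void)
{
    return FLT_EVAL_METHOD;
}
```

On an x86-32 target using '-mfpmath=387' the value is typically 2, reflecting the 80-bit temporaries described above, while '-mfpmath=sse' together with '-msse2' typically yields 0.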
1
1 '-masm=DIALECT'
1 Output assembly instructions using selected DIALECT. Also affects
1 which dialect is used for basic 'asm' (⇒Basic Asm) and
1 extended 'asm' (⇒Extended Asm). Supported choices (in
1 dialect order) are 'att' or 'intel'. The default is 'att'. Darwin
1 does not support 'intel'.
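
Inline assembly that must build under either dialect can supply both spellings in one template. The brace syntax used below is described under Extended Asm, not in this section, and the function name is hypothetical. Treat this as a sketch rather than a recommended idiom.

```c
/* Adds b to a.  On x86 the addition is done with inline assembly that
   is valid for both -masm=att and -masm=intel; elsewhere plain C is used. */
int add_with_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* Inside {...|...}, the text before '|' is emitted for the AT&T
       dialect and the text after it for the Intel dialect. */
    __asm__("add{l %1, %0| %0, %1}" : "+r"(a) : "r"(b) : "cc");
#else
    a += b;
#endif
    return a;
}
```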
1
1 '-mieee-fp'
1 '-mno-ieee-fp'
1 Control whether or not the compiler uses IEEE floating-point
1 comparisons. These correctly handle the case where the result of a
1 comparison is unordered.
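
A comparison is unordered when at least one operand is a NaN. The following sketch shows the case in source form; it uses the standard C99 'isunordered' macro, and the function name is hypothetical.

```c
#include <math.h>

/* Returns 1 if a < b, 0 if a >= b, and -1 if the comparison is
   unordered (at least one operand is a NaN). */
int compare_less(double a, double b)
{
    if (isunordered(a, b))
        return -1;
    return a < b ? 1 : 0;
}
```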
1
1 '-m80387'
1 '-mhard-float'
1 Generate output containing 80387 instructions for floating point.
1
1 '-mno-80387'
1 '-msoft-float'
1 Generate output containing library calls for floating point.
1
1 *Warning:* the requisite libraries are not part of GCC. Normally
1 the facilities of the machine's usual C compiler are used, but this
1 cannot be done directly in cross-compilation. You must make your
1 own arrangements to provide suitable library functions for
1 cross-compilation.
1
1 On machines where a function returns floating-point results in the
1 80387 register stack, some floating-point opcodes may be emitted
1 even if '-msoft-float' is used.
1
1 '-mno-fp-ret-in-387'
1 Do not use the FPU registers for return values of functions.
1
1 The usual calling convention has functions return values of types
1 'float' and 'double' in an FPU register, even if there is no FPU.
1 The idea is that the operating system should emulate an FPU.
1
1 The option '-mno-fp-ret-in-387' causes such values to be returned
1 in ordinary CPU registers instead.
1
1 '-mno-fancy-math-387'
1 Some 387 emulators do not support the 'sin', 'cos' and 'sqrt'
1 instructions for the 387. Specify this option to avoid generating
1 those instructions. This option is the default on OpenBSD and
1 NetBSD. This option is overridden when '-march' indicates that the
1 target CPU always has an FPU and so the instruction does not need
1 emulation. These instructions are not generated unless you also
1 use the '-funsafe-math-optimizations' switch.
1
1 '-malign-double'
1 '-mno-align-double'
1 Control whether GCC aligns 'double', 'long double', and 'long long'
1 variables on a two-word boundary or a one-word boundary. Aligning
1 'double' variables on a two-word boundary produces code that runs
1 somewhat faster on a Pentium at the expense of more memory.
1
1 On x86-64, '-malign-double' is enabled by default.
1
1 *Warning:* if you use the '-malign-double' switch, structures
1 containing the above types are aligned differently than the
1 published application binary interface specifications for the
1 x86-32 and are not binary compatible with structures in code
1 compiled without that switch.
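
The layout change mentioned in the warning can be made visible with 'offsetof'. This is an illustrative sketch; the structure and function names are hypothetical.

```c
#include <stddef.h>

struct sample {
    char   tag;
    double value;   /* its offset depends on the alignment of double */
};

/* Returns the byte offset of 'value' inside struct sample. */
size_t offset_of_value(void)
{
    return offsetof(struct sample, value);
}
```

On x86-32 the offset would be expected to be 4 by default and 8 with '-malign-double', which is why objects built with and without the switch cannot share such structures.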
1
1 '-m96bit-long-double'
1 '-m128bit-long-double'
1 These switches control the size of 'long double' type. The x86-32
1 application binary interface specifies the size to be 96 bits, so
1 '-m96bit-long-double' is the default in 32-bit mode.
1
1 Modern architectures (Pentium and newer) prefer 'long double' to be
1 aligned to an 8- or 16-byte boundary. In arrays or structures
1 conforming to the ABI, this is not possible. So specifying
1 '-m128bit-long-double' aligns 'long double' to a 16-byte boundary
1 by padding the 'long double' with an additional 32-bit zero.
1
1 In the x86-64 compiler, '-m128bit-long-double' is the default
1 choice as its ABI specifies that 'long double' is aligned on
1 16-byte boundary.
1
1 Notice that neither of these options enables any extra precision
1 over the x87 standard of 80 bits for a 'long double'.
1
1 *Warning:* if you override the default value for your target ABI,
1 this changes the size of structures and arrays containing 'long
1 double' variables, as well as modifying the function calling
1 convention for functions taking 'long double'. Hence they are not
1 binary-compatible with code compiled without that switch.
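
A quick way to check which size is in effect is to query 'sizeof'. The helper names below are hypothetical.

```c
#include <stddef.h>

/* Storage size of one long double, in bytes. */
size_t long_double_size(void)
{
    return sizeof(long double);
}

/* Storage size of a two-element array, showing how the padding
   propagates into arrays and structures. */
size_t long_double_pair_size(void)
{
    return sizeof(long double[2]);
}
```

With '-m96bit-long-double' these would be expected to return 12 and 24; with '-m128bit-long-double', 16 and 32. The extra bytes are padding only, in line with the note above that no precision is gained.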
1
1 '-mlong-double-64'
1 '-mlong-double-80'
1 '-mlong-double-128'
1 These switches control the size of 'long double' type. A size of
1 64 bits makes the 'long double' type equivalent to the 'double'
1 type. This is the default for 32-bit Bionic C library. A size of
1 128 bits makes the 'long double' type equivalent to the
1 '__float128' type. This is the default for 64-bit Bionic C
1 library.
1
1 *Warning:* if you override the default value for your target ABI,
1 this changes the size of structures and arrays containing 'long
1 double' variables, as well as modifying the function calling
1 convention for functions taking 'long double'. Hence they are not
1 binary-compatible with code compiled without that switch.
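
Unlike the padding options above, these switches change the format itself, which shows up in the standard 'LDBL_MANT_DIG' macro from '<float.h>'. A minimal sketch, with a hypothetical function name:

```c
#include <float.h>

/* Number of significand bits in long double: 53 for a 64-bit format,
   64 for the x87 80-bit format, 113 for a 128-bit IEEE format. */
int long_double_mantissa_bits(void)
{
    return LDBL_MANT_DIG;
}
```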
1
1 '-malign-data=TYPE'
1 Control how GCC aligns variables. Supported values for TYPE are:
1 'compat' uses an increased alignment value compatible with GCC 4.8
1 and earlier, 'abi' uses the alignment value specified by the psABI,
1 and 'cacheline' uses an increased alignment value to match the
1 cache line size. 'compat' is the default.
1
1 '-mlarge-data-threshold=THRESHOLD'
1 When '-mcmodel=medium' is specified, data objects larger than
1 THRESHOLD are placed in the large data section. This value must be
1 the same across all objects linked into the binary, and defaults to
1 65535.
1
1 '-mrtd'
1 Use a different function-calling convention, in which functions
1 that take a fixed number of arguments return with the 'ret NUM'
1 instruction, which pops their arguments while returning. This
1 saves one instruction in the caller since there is no need to pop
1 the arguments there.
1
1 You can specify that an individual function is called with this
1 calling sequence with the function attribute 'stdcall'. You can
1 also override the '-mrtd' option by using the function attribute
1 'cdecl'. ⇒Function Attributes.
1
1 *Warning:* this calling convention is incompatible with the one
1 normally used on Unix, so you cannot use it if you need to call
1 libraries compiled with the Unix compiler.
1
1 Also, you must provide function prototypes for all functions that
1 take variable numbers of arguments (including 'printf'); otherwise
1 incorrect code is generated for calls to those functions.
1
1 In addition, seriously incorrect code results if you call a
1 function with too many arguments. (Normally, extra arguments are
1 harmlessly ignored.)
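
The per-function attributes mentioned above can be sketched as follows. The macro and function names are hypothetical, and the attributes are applied only when targeting 32-bit x86, on the assumption that they are not meaningful elsewhere.

```c
#if defined(__GNUC__) && defined(__i386__)
#define CALLEE_POPS __attribute__((stdcall))
#define CALLER_POPS __attribute__((cdecl))
#else
#define CALLEE_POPS
#define CALLER_POPS
#endif

/* Fixed argument count: with 'stdcall' the callee pops its arguments,
   as every such function does under -mrtd. */
CALLEE_POPS int add3(int a, int b, int c)
{
    return a + b + c;
}

/* Keeps the usual convention even in a file compiled with -mrtd. */
CALLER_POPS int mul2(int a, int b)
{
    return a * b;
}
```

Both functions need a visible prototype with the same attribute at every call site; otherwise caller and callee disagree about who pops the arguments.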
1
1 '-mregparm=NUM'
1 Control how many registers are used to pass integer arguments. By
1 default, no registers are used to pass arguments, and at most 3
1 registers can be used. You can control this behavior for a
1 specific function by using the function attribute 'regparm'.
1 ⇒Function Attributes.
1
1 *Warning:* if you use this switch, and NUM is nonzero, then you
1 must build all modules with the same value, including any
1 libraries. This includes the system libraries and startup modules.
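
Applying the attribute to individual functions avoids the need to rebuild every module. A small sketch, with hypothetical macro and function names, guarded so that the attribute is used only on 32-bit x86:

```c
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3
#endif

/* With regparm(3), up to three integer arguments travel in registers
   instead of on the stack. */
REGPARM3 int clamp_int(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
```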
1
1 '-msseregparm'
1 Use SSE register passing conventions for float and double arguments
1 and return values. You can control this behavior for a specific
1 function by using the function attribute 'sseregparm'.
1 ⇒Function Attributes.
1
1 *Warning:* if you use this switch then you must build all modules
1 with the same value, including any libraries. This includes the
1 system libraries and startup modules.
1
1 '-mvect8-ret-in-mem'
1 Return 8-byte vectors in memory instead of MMX registers. This is
1 the default on Solaris 8 and 9 and VxWorks to match the ABI of the
1 Sun Studio compilers until version 12. Later compiler versions
1 (starting with Studio 12 Update 1) follow the ABI used by other x86
1 targets, which is the default on Solaris 10 and later. _Only_ use
1 this option if you need to remain compatible with existing code
1 produced by those previous compiler versions or older versions of
1 GCC.
1
1 '-mpc32'
1 '-mpc64'
1 '-mpc80'
1
1 Set 80387 floating-point precision to 32, 64 or 80 bits. When
1 '-mpc32' is specified, the significands of results of
1 floating-point operations are rounded to 24 bits (single
1 precision); '-mpc64' rounds the significands of results of
1 floating-point operations to 53 bits (double precision) and
1 '-mpc80' rounds the significands of results of floating-point
1 operations to 64 bits (extended double precision), which is the
1 default. When this option is used, floating-point operations in
1 higher precisions are not available to the programmer without
1 setting the FPU control word explicitly.
1
1 Setting the rounding of floating-point operations to less than the
1 default 80 bits can speed some programs by 2% or more. Note that
1 some mathematical libraries assume that extended-precision (80-bit)
1 floating-point operations are enabled by default; routines in such
1 libraries could suffer significant loss of accuracy, typically
1 through so-called "catastrophic cancellation", when this option is
1 used to set the precision to less than extended precision.
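
The loss of extended precision can be probed at run time. The sketch below assumes that 'long double' is the x87 80-bit type; the expectation that it returns 0 under '-mpc64' or '-mpc32' follows from the significand widths given above and is an assumption, not a documented guarantee. The function name is hypothetical.

```c
#include <float.h>

/* Returns 1 if adding LDBL_EPSILON to 1.0L produces a value different
   from 1.0L, that is, if the full long double significand is in use.
   The volatile qualifiers keep the arithmetic at run time. */
int long_double_epsilon_survives(void)
{
    volatile long double one = 1.0L;
    volatile long double eps = LDBL_EPSILON;
    volatile long double sum = one + eps;
    return sum != one;
}
```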
1
1 '-mstackrealign'
1 Realign the stack at entry. On the x86, the '-mstackrealign'
1 option generates an alternate prologue and epilogue that realigns
1 the run-time stack if necessary. This supports mixing legacy codes
1 that keep 4-byte stack alignment with modern codes that keep
1 16-byte stack alignment for SSE compatibility. See also the
1 attribute 'force_align_arg_pointer', applicable to individual
1 functions.
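
For a single entry point, the attribute form might look like the sketch below. The macro and function names are hypothetical, and the attribute is applied only on 32-bit x86, where the 4-byte legacy alignment described above occurs.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_ON_ENTRY __attribute__((force_align_arg_pointer))
#else
#define REALIGN_ON_ENTRY
#endif

/* Hypothetical callback that legacy code may enter with a stack that
   is only 4-byte aligned.  Returns 1 if its 16-byte aligned local
   really is 16-byte aligned. */
REALIGN_ON_ENTRY int callback_with_aligned_local(void)
{
    _Alignas(16) volatile unsigned char scratch[16] = {0};
    return ((uintptr_t)scratch % 16) == 0;
}
```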
1
1 '-mpreferred-stack-boundary=NUM'
1 Attempt to keep the stack boundary aligned to a 2 raised to NUM
1 byte boundary. If '-mpreferred-stack-boundary' is not specified,
1 the default is 4 (16 bytes or 128 bits).
1
1 *Warning:* When generating code for the x86-64 architecture with
1 SSE extensions disabled, '-mpreferred-stack-boundary=3' can be used
1 to keep the stack boundary aligned to an 8-byte boundary. Since the
1 x86-64 ABI requires 16-byte stack alignment, this is ABI
1 incompatible and intended to be used in a controlled environment
1 where stack space is an important limitation. This option leads to
1 wrong code when functions compiled with 16-byte stack alignment
1 (such as functions from a standard library) are called with a
1 misaligned stack. In this case, SSE instructions may lead to
1 misaligned memory access traps. In addition, variable arguments
1 are handled incorrectly for 16-byte aligned objects (including x87
1 'long double' and '__int128'), leading to wrong results. You must
1 build all modules with '-mpreferred-stack-boundary=3', including
1 any libraries. This includes the system libraries and startup
1 modules.
1
1 '-mincoming-stack-boundary=NUM'
1 Assume the incoming stack is aligned to a 2 raised to NUM byte
1 boundary. If '-mincoming-stack-boundary' is not specified, the one
1 specified by '-mpreferred-stack-boundary' is used.
1
1 On Pentium and Pentium Pro, 'double' and 'long double' values
1 should be aligned to an 8-byte boundary (see '-malign-double') or
1 suffer significant run time performance penalties. On Pentium III,
1 the Streaming SIMD Extension (SSE) data type '__m128' may not work
1 properly if it is not 16-byte aligned.
1
1 To ensure proper alignment of these values on the stack, the stack
1 boundary must be as aligned as that required by any value stored on
1 the stack. Further, every function must be generated such that it
1 keeps the stack aligned. Thus calling a function compiled with a
1 higher preferred stack boundary from a function compiled with a
1 lower preferred stack boundary most likely misaligns the stack. It
1 is recommended that libraries that use callbacks always use the
1 default setting.
1
1 This extra alignment does consume extra stack space, and generally
1 increases code size. Code that is sensitive to stack space usage,
1 such as embedded systems and operating system kernels, may want to
1 reduce the preferred alignment to '-mpreferred-stack-boundary=2'.
1
1 '-mmmx'
1 '-msse'
1 '-msse2'
1 '-msse3'
1 '-mssse3'
1 '-msse4'
1 '-msse4a'
1 '-msse4.1'
1 '-msse4.2'
1 '-mavx'
1 '-mavx2'
1 '-mavx512f'
1 '-mavx512pf'
1 '-mavx512er'
1 '-mavx512cd'
1 '-mavx512vl'
1 '-mavx512bw'
1 '-mavx512dq'
1 '-mavx512ifma'
1 '-mavx512vbmi'
1 '-msha'
1 '-maes'
1 '-mpclmul'
1 '-mclflushopt'
1 '-mclwb'
1 '-mfsgsbase'
1 '-mrdrnd'
1 '-mf16c'
1 '-mfma'
1 '-mpconfig'
1 '-mwbnoinvd'
1 '-mfma4'
1 '-mprfchw'
1 '-mrdpid'
1 '-mprefetchwt1'
1 '-mrdseed'
1 '-msgx'
1 '-mxop'
1 '-mlwp'
1 '-m3dnow'
1 '-m3dnowa'
1 '-mpopcnt'
1 '-mabm'
1 '-madx'
1 '-mbmi'
1 '-mbmi2'
1 '-mlzcnt'
1 '-mfxsr'
1 '-mxsave'
1 '-mxsaveopt'
1 '-mxsavec'
1 '-mxsaves'
1 '-mrtm'
1 '-mhle'
1 '-mtbm'
1 '-mmpx'
1 '-mmwaitx'
1 '-mclzero'
1 '-mpku'
1 '-mavx512vbmi2'
1 '-mgfni'
1 '-mvaes'
1 '-mvpclmulqdq'
1 '-mavx512bitalg'
1 '-mmovdiri'
1 '-mmovdir64b'
1 '-mavx512vpopcntdq'
1 '-mavx5124fmaps'
1 '-mavx512vnni'
1 '-mavx5124vnniw'
1 These switches enable the use of instructions in the MMX, SSE,
1 SSE2, SSE3, SSSE3, SSE4, SSE4A, SSE4.1, SSE4.2, AVX, AVX2, AVX512F,
1 AVX512PF, AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ,
1 AVX512IFMA, AVX512VBMI, SHA, AES, PCLMUL, CLFLUSHOPT, CLWB,
1 FSGSBASE, RDRND, F16C, FMA, PCONFIG, WBNOINVD, FMA4, PREFETCHW,
1 RDPID, PREFETCHWT1, RDSEED, SGX, XOP, LWP, 3DNow!, enhanced 3DNow!,
1 POPCNT, ABM, ADX, BMI, BMI2, LZCNT, FXSR, XSAVE, XSAVEOPT, XSAVEC,
1 XSAVES, RTM, HLE, TBM, MPX, MWAITX, CLZERO, PKU, AVX512VBMI2, GFNI,
1 VAES, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B,
1 AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, or AVX5124VNNIW extended
1 instruction sets. Each has a corresponding '-mno-' option to
1 disable use of these instructions.
1
1 These extensions are also available as built-in functions: see
1 ⇒x86 Built-in Functions, for details of the functions
1 enabled and disabled by these switches.
1
1 To generate SSE/SSE2 instructions automatically from floating-point
1 code (as opposed to 387 instructions), see '-mfpmath=sse'.
1
1 GCC does not emit legacy SSEx instructions when '-mavx' is used.
1 Instead, it generates new AVX instructions or the AVX equivalents
1 of all SSEx instructions when needed.
1
1 These options enable GCC to use these extended instructions in
1 generated code, even without '-mfpmath=sse'. Applications that
1 perform run-time CPU detection must compile separate files for each
1 supported architecture, using the appropriate flags. In
1 particular, the file containing the CPU detection code should be
1 compiled without these options.
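
The run-time detection pattern described above could be sketched as follows. The built-ins '__builtin_cpu_init' and '__builtin_cpu_supports' are documented under x86 Built-in Functions rather than here. All function names are hypothetical, and the "AVX2" kernel is a plain C stand-in so that the example stays self-contained.

```c
#include <stddef.h>

/* Baseline kernel: belongs in a file compiled without extra -m flags. */
long sum_baseline(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Stand-in for a kernel that a real build would place in its own
   file compiled with -mavx2. */
long sum_avx2_standin(const int *v, size_t n)
{
    return sum_baseline(v, n);
}

/* Detection code: compiled without ISA switches so that it can run on
   any CPU the program supports. */
long sum_dispatch(const int *v, size_t n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2_standin(v, n);
#endif
    return sum_baseline(v, n);
}
```

Keeping the dispatcher free of '-mavx2' matters because the compiler could otherwise use AVX2 instructions inside the detection code itself, before the check has run.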
1
1 '-mdump-tune-features'
1 This option instructs GCC to dump the names of the x86 performance
1 tuning features and default settings. The names can be used in
1 '-mtune-ctrl=FEATURE-LIST'.
1
1 '-mtune-ctrl=FEATURE-LIST'
1 This option is used for fine-grained control of x86 code generation
1 features. FEATURE-LIST is a comma-separated list of FEATURE names.
1 See also '-mdump-tune-features'. When specified, the FEATURE is
1 turned on if it is not preceded with '^', otherwise, it is turned
1 off. '-mtune-ctrl=FEATURE-LIST' is intended to be used by GCC
1 developers. Using it may lead to code paths not covered by testing
1 and can potentially result in compiler ICEs or runtime errors.
1
1 '-mno-default'
1 This option instructs GCC to turn off all tunable features. See
1 also '-mtune-ctrl=FEATURE-LIST' and '-mdump-tune-features'.
1
1 '-mcld'
1 This option instructs GCC to emit a 'cld' instruction in the
1 prologue of functions that use string instructions. String
1 instructions depend on the DF flag to select between autoincrement
1 or autodecrement mode. While the ABI specifies the DF flag to be
1 cleared on function entry, some operating systems violate this
1 specification by not clearing the DF flag in their exception
1 dispatchers. The exception handler can be invoked with the DF flag
1 set, which leads to wrong direction mode when string instructions
1 are used. This option can be enabled by default on 32-bit x86
1 targets by configuring GCC with the '--enable-cld' configure
1 option. Generation of 'cld' instructions can be suppressed with
1 the '-mno-cld' compiler option in this case.
1
1 '-mvzeroupper'
1 This option instructs GCC to emit a 'vzeroupper' instruction before
1 a transfer of control flow out of the function to minimize the AVX
1 to SSE transition penalty as well as remove unnecessary 'zeroupper'
1 intrinsics.
1
1 '-mprefer-avx128'
1 This option instructs GCC to use 128-bit AVX instructions instead
1 of 256-bit AVX instructions in the auto-vectorizer.
1
1 '-mprefer-vector-width=OPT'
1 This option instructs GCC to use OPT-bit vector width in
1 instructions instead of the default on the selected platform.
1
1 'none'
1 No extra limitations are applied to GCC other than those defined by the
1 selected platform.
1
1 '128'
1 Prefer 128-bit vector width for instructions.
1
1 '256'
1 Prefer 256-bit vector width for instructions.
1
1 '512'
1 Prefer 512-bit vector width for instructions.
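
As a rough illustration, a loop such as the one below is a typical candidate for the auto-vectorizer. Compiling it with, say, '-O3 -march=skylake-avx512 -mprefer-vector-width=256' and inspecting the assembly is one way to see which register width was chosen. The function name 'saxpy' is hypothetical, and whether the loop is actually vectorized depends on the GCC version and the optimization flags in use.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i]; 'restrict' tells the compiler the arrays do
   not overlap, which makes vectorization easier to prove safe. */
void saxpy(float a, const float *restrict x, float *restrict y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```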
1
1 '-mcx16'
1 This option enables GCC to generate 'CMPXCHG16B' instructions in
1 64-bit code to implement compare-and-exchange operations on 16-byte
1 aligned 128-bit objects. This is useful for atomic updates of data
1 structures exceeding one machine word in size. The compiler uses
1 this instruction to implement ⇒__sync Builtins. However,
1 for ⇒__atomic Builtins operating on 128-bit integers, a
1 library call is always used.
1
1 '-msahf'
1 This option enables generation of 'SAHF' instructions in 64-bit
1 code. Early Intel Pentium 4 CPUs with Intel 64 support, prior to
1 the introduction of Pentium 4 G1 step in December 2005, lacked the
1 'LAHF' and 'SAHF' instructions which are supported by AMD64. These
1 are load and store instructions, respectively, for certain status
1 flags. In 64-bit mode, the 'SAHF' instruction is used to optimize
1 'fmod', 'drem', and 'remainder' built-in functions; see ⇒Other
1 Builtins for details.
1
1 '-mmovbe'
1 This option enables use of the 'movbe' instruction to implement
1 '__builtin_bswap32' and '__builtin_bswap64'.
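
A typical use of '__builtin_bswap32' is reading a big-endian field on a little-endian host, sketched below. The helper name 'load_be32' is hypothetical. Whether the compiler fuses the load and the swap into a single 'movbe' instruction is its own decision and is not guaranteed by this code.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value from an arbitrary (possibly unaligned)
   address and return it in host byte order. */
uint32_t load_be32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);           /* safe unaligned load */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);          /* swap only on little-endian hosts */
#endif
    return v;
}
```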
1
1 '-mshstk'
1 The '-mshstk' option enables shadow stack built-in functions from
1 x86 Control-flow Enforcement Technology (CET).
1
1 '-mcrc32'
1 This option enables built-in functions '__builtin_ia32_crc32qi',
1 '__builtin_ia32_crc32hi', '__builtin_ia32_crc32si' and
1 '__builtin_ia32_crc32di' to generate the 'crc32' machine
1 instruction.
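
The following sketch shows one way the byte-wise built-in might be used to compute a CRC-32C checksum. So that it compiles without '-mcrc32' on the command line, it enables the instruction for one function only through the 'target("sse4.2")' function attribute (an assumption about how the example is built, not something this option requires). It falls back to a bitwise software loop when the CPU lacks the instruction. The names 'crc32c', 'crc32c_hw' and 'crc32c_sw' are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_CRC32 1
/* Hardware path: one 'crc32' instruction per input byte. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(uint32_t crc, const unsigned char *p, size_t n)
{
    while (n--)
        crc = __builtin_ia32_crc32qi(crc, *p++);
    return crc;
}
#endif

/* Software path: bitwise CRC-32C using the reflected polynomial. */
static uint32_t crc32c_sw(uint32_t crc, const unsigned char *p, size_t n)
{
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* CRC-32C with the conventional initial value and final inversion. */
uint32_t crc32c(const void *buf, size_t n)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;
#ifdef HAVE_X86_CRC32
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2"))
        return ~crc32c_hw(crc, p, n);
#endif
    return ~crc32c_sw(crc, p, n);
}
```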
1
1 '-mrecip'
1 This option enables use of 'RCPSS' and 'RSQRTSS' instructions (and
1 their vectorized variants 'RCPPS' and 'RSQRTPS') with an additional
1 Newton-Raphson step to increase precision instead of 'DIVSS' and
1 'SQRTSS' (and their vectorized variants) for single-precision
1 floating-point arguments. These instructions are generated only
1 when '-funsafe-math-optimizations' is enabled together with
1 '-ffinite-math-only' and '-fno-trapping-math'. Note that while the
1 throughput of the sequence is higher than the throughput of the
1 non-reciprocal instruction, the precision of the sequence can be
1 decreased by up to 2 ulp (i.e. the inverse of 1.0 equals
1 0.99999994).
1
1 Note that GCC implements '1.0f/sqrtf(X)' in terms of 'RSQRTSS' (or
1 'RSQRTPS') already with '-ffast-math' (or the above option
1 combination), and doesn't need '-mrecip'.
1
1 Also note that GCC emits the above sequence with additional
1 Newton-Raphson step for vectorized single-float division and
1 vectorized 'sqrtf(X)' already with '-ffast-math' (or the above
1 option combination), and doesn't need '-mrecip'.
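
The Newton-Raphson refinement mentioned above can be written out in plain C to show the arithmetic involved. This is only a model of the technique: the function names are hypothetical, the initial estimates are passed in by the caller rather than produced by 'RCPSS' or 'RSQRTSS', and the exact instruction sequence GCC emits may differ.

```c
/* One Newton-Raphson step for 1/a, starting from the estimate x0:
   x1 = x0 * (2 - a * x0). */
float refine_reciprocal(float a, float x0)
{
    return x0 * (2.0f - a * x0);
}

/* One Newton-Raphson step for 1/sqrt(a), starting from the estimate y0:
   y1 = y0 * (1.5 - 0.5 * a * y0 * y0). */
float refine_rsqrt(float a, float y0)
{
    return y0 * (1.5f - 0.5f * a * y0 * y0);
}
```

Each step roughly doubles the number of correct bits in the estimate, which is why a single step after the hardware approximation is usually enough for single precision.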
1
1 '-mrecip=OPT'
1 This option controls which reciprocal estimate instructions may be
1 used. OPT is a comma-separated list of options, which may be
1 preceded by a '!' to invert the option:
1
1 'all'
1 Enable all estimate instructions.
1
1 'default'
1 Enable the default instructions, equivalent to '-mrecip'.
1
1 'none'
1 Disable all estimate instructions, equivalent to '-mno-recip'.
1
1 'div'
1 Enable the approximation for scalar division.
1
1 'vec-div'
1 Enable the approximation for vectorized division.
1
1 'sqrt'
1 Enable the approximation for scalar square root.
1
1 'vec-sqrt'
1 Enable the approximation for vectorized square root.
1
1 So, for example, '-mrecip=all,!sqrt' enables all of the reciprocal
1 approximations, except for square root.
1
1 '-mveclibabi=TYPE'
1 Specifies the ABI type to use for vectorizing intrinsics using an
1 external library. Supported values for TYPE are 'svml' for the
1 Intel short vector math library and 'acml' for the AMD math core
1 library. To use this option, both '-ftree-vectorize' and
1 '-funsafe-math-optimizations' have to be enabled, and an SVML or
1 ACML ABI-compatible library must be specified at link time.
1
1 GCC currently emits calls to 'vmldExp2', 'vmldLn2', 'vmldLog102',
1 'vmldPow2', 'vmldTanh2', 'vmldTan2', 'vmldAtan2', 'vmldAtanh2',
1 'vmldCbrt2', 'vmldSinh2', 'vmldSin2', 'vmldAsinh2', 'vmldAsin2',
1 'vmldCosh2', 'vmldCos2', 'vmldAcosh2', 'vmldAcos2', 'vmlsExp4',
1 'vmlsLn4', 'vmlsLog104', 'vmlsPow4', 'vmlsTanh4', 'vmlsTan4',
1 'vmlsAtan4', 'vmlsAtanh4', 'vmlsCbrt4', 'vmlsSinh4', 'vmlsSin4',
1 'vmlsAsinh4', 'vmlsAsin4', 'vmlsCosh4', 'vmlsCos4', 'vmlsAcosh4'
1 and 'vmlsAcos4' for corresponding function type when
1 '-mveclibabi=svml' is used, and '__vrd2_sin', '__vrd2_cos',
1 '__vrd2_exp', '__vrd2_log', '__vrd2_log2', '__vrd2_log10',
1 '__vrs4_sinf', '__vrs4_cosf', '__vrs4_expf', '__vrs4_logf',
1 '__vrs4_log2f', '__vrs4_log10f' and '__vrs4_powf' for the
1 corresponding function type when '-mveclibabi=acml' is used.
1
1 '-mabi=NAME'
1 Generate code for the specified calling convention. Permissible
1 values are 'sysv' for the ABI used on GNU/Linux and other systems,
1 and 'ms' for the Microsoft ABI. The default is to use the Microsoft
1 ABI when targeting Microsoft Windows and the SysV ABI on all other
1 systems. You can control this behavior for specific functions by
1 using the function attributes 'ms_abi' and 'sysv_abi'. ⇒Function
1 Attributes.
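
A per-function override might look like the sketch below. The names 'MS_ABI', 'add_ms' and 'call_add_ms' are hypothetical, and the attribute is applied only when targeting x86-64, where the two calling conventions differ in register usage.

```c
/* Apply the Microsoft calling convention to one function only; on other
   targets the macro expands to nothing. */
#if defined(__GNUC__) && defined(__x86_64__)
#define MS_ABI __attribute__((ms_abi))
#else
#define MS_ABI
#endif

/* Callee that uses the Microsoft ABI regardless of -mabi. */
MS_ABI int add_ms(int a, int b)
{
    return a + b;
}

/* Caller compiled with the default ABI; the compiler inserts whatever
   register shuffling the cross-ABI call requires. */
int call_add_ms(int a, int b)
{
    return add_ms(a, b);
}
```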
1
1 '-mforce-indirect-call'
1 Force all calls to functions to be indirect. This is useful when
1 using Intel Processor Trace where it generates more precise timing
1 information for function calls.
1
1 '-mcall-ms2sysv-xlogues'
1 Due to differences in 64-bit ABIs, any Microsoft ABI function that
1 calls a System V ABI function must consider RSI, RDI and XMM6-15 as
1 clobbered. By default, the code for saving and restoring these
1 registers is emitted inline, resulting in fairly lengthy prologues
1 and epilogues. Using '-mcall-ms2sysv-xlogues' emits prologues and
1 epilogues that use stubs in the static portion of libgcc to perform
1 these saves and restores, thus reducing function size at the cost
1 of a few extra instructions.
1
1 '-mtls-dialect=TYPE'
1 Generate code to access thread-local storage using the 'gnu' or
1 'gnu2' conventions. 'gnu' is the conservative default; 'gnu2' is
1 more efficient, but it may add compile- and run-time requirements
1 that cannot be satisfied on all systems.
1
1 '-mpush-args'
1 '-mno-push-args'
1 Use PUSH operations to store outgoing parameters. This method is
1 shorter and usually as fast as the method using SUB/MOV operations
1 and is enabled by default. In some cases disabling it may improve
1 performance because of improved scheduling and reduced
1 dependencies.
1
1 '-maccumulate-outgoing-args'
1 If enabled, the maximum amount of space required for outgoing
1 arguments is computed in the function prologue. This is faster on
1 most modern CPUs because of reduced dependencies, improved
1 scheduling and reduced stack usage when the preferred stack
1 boundary is not equal to 2. The drawback is a notable increase in
1 code size. This switch implies '-mno-push-args'.
1
1 '-mthreads'
1 Support thread-safe exception handling on MinGW. Programs that rely
1 on thread-safe exception handling must compile and link all code
1 with the '-mthreads' option. When compiling, '-mthreads' defines
1 '-D_MT'; when linking, it links in a special thread helper library
1 '-lmingwthrd' which cleans up per-thread exception-handling data.
1
1 '-mms-bitfields'
1 '-mno-ms-bitfields'
1
1 Enable/disable bit-field layout compatible with the native
1 Microsoft Windows compiler.
1
1 If 'packed' is used on a structure, or if bit-fields are used, it
1 may be that the Microsoft ABI lays out the structure differently
1 than the way GCC normally does. Particularly when moving packed
1 data between functions compiled with GCC and the native Microsoft
1 compiler (either via function call or as data in a file), it may be
1 necessary to access either format.
1
1 This option is enabled by default for Microsoft Windows targets.
1 This behavior can also be controlled locally by use of variable or
1 type attributes. For more information, see ⇒x86 Variable
1 Attributes and ⇒x86 Type Attributes.
1
1 The Microsoft structure layout algorithm is fairly simple with the
1 exception of the bit-field packing. The padding and alignment of
1 members of structures and whether a bit-field can straddle a
1 storage-unit boundary are determined by these rules:
1
1 1. Structure members are stored sequentially in the order in
1 which they are declared: the first member has the lowest
1 memory address and the last member the highest.
1
1 2. Every data object has an alignment requirement. The alignment
1 requirement for all data except structures, unions, and arrays
1 is either the size of the object or the current packing size
1 (specified with either the 'aligned' attribute or the 'pack'
1 pragma), whichever is less. For structures, unions, and
1 arrays, the alignment requirement is the largest alignment
1 requirement of its members. Every object is allocated an
1 offset so that:
1
1 offset % alignment_requirement == 0
1
1 3. Adjacent bit-fields are packed into the same 1-, 2-, or 4-byte
1 allocation unit if the integral types are the same size and if
1 the next bit-field fits into the current allocation unit
1 without crossing the boundary imposed by the common alignment
1 requirements of the bit-fields.
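
The offset rule in item 2 can be checked mechanically with 'offsetof' and the C11 '_Alignof' operator, as in the sketch below. The structure 'sample' and the function name are hypothetical. The check holds for any layout without packing, so it does not by itself distinguish the GCC layout from the Microsoft one.

```c
#include <stddef.h>

struct sample {
    char   c;
    int    i;
    short  s;
    double d;
};

/* Returns 1 if every member satisfies
   offset % alignment_requirement == 0. */
int sample_offsets_are_aligned(void)
{
    return offsetof(struct sample, c) % _Alignof(char)   == 0
        && offsetof(struct sample, i) % _Alignof(int)    == 0
        && offsetof(struct sample, s) % _Alignof(short)  == 0
        && offsetof(struct sample, d) % _Alignof(double) == 0;
}
```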
1
1 MSVC interprets zero-length bit-fields in the following ways:
1
1 1. If a zero-length bit-field is inserted between two bit-fields
1 that are normally coalesced, the bit-fields are not coalesced.
1
1 For example:
1
1 struct
1 {
1 unsigned long bf_1 : 12;
1 unsigned long : 0;
1 unsigned long bf_2 : 12;
1 } t1;
1
1 The size of 't1' is 8 bytes with the zero-length bit-field.
1 If the zero-length bit-field were removed, 't1''s size would
1 be 4 bytes.
1
1 2. If a zero-length bit-field is inserted after a bit-field,
1 'foo', and the alignment of the zero-length bit-field is
1 greater than the member that follows it, 'bar', 'bar' is
1 aligned as the type of the zero-length bit-field.
1
1 For example:
1
1 struct
1 {
1 char foo : 4;
1 short : 0;
1 char bar;
1 } t2;
1
1 struct
1 {
1 char foo : 4;
1 short : 0;
1 double bar;
1 } t3;
1
1 For 't2', 'bar' is placed at offset 2, rather than offset 1.
1 Accordingly, the size of 't2' is 4. For 't3', the zero-length
1 bit-field does not affect the alignment of 'bar' or, as a
1 result, the size of the structure.
1
1 Taking this into account, it is important to note the
1 following:
1
1 1. If a zero-length bit-field follows a normal bit-field,
1 the type of the zero-length bit-field may affect the
1 alignment of the structure as a whole. For example, 't2'
1 has a size of 4 bytes, since the zero-length bit-field
1 follows a normal bit-field, and is of type short.
1
1 2. Even if a zero-length bit-field is not followed by a
1 normal bit-field, it may still affect the alignment of
1 the structure:
1
1 struct
1 {
1 char foo : 6;
1 long : 0;
1 } t4;
1
1 Here, 't4' takes up 4 bytes.
1
1 3. Zero-length bit-fields following non-bit-field members are
1 ignored:
1
1 struct
1 {
1 char foo;
1 long : 0;
1 char bar;
1 } t5;
1
1 Here, 't5' takes up 2 bytes.
1
1 '-mno-align-stringops'
1 Do not align the destination of inlined string operations. This
1 switch reduces code size and improves performance in case the
1 destination is already aligned, but GCC doesn't know about it.
1
1 '-minline-all-stringops'
1 By default GCC inlines string operations only when the destination
1 is known to be aligned to at least a 4-byte boundary. This enables
1 more inlining and increases code size, but may improve performance
1 of code that depends on fast 'memcpy', 'strlen', and 'memset' for
1 short lengths.
1
1 '-minline-stringops-dynamically'
1 For string operations of unknown size, use run-time checks with
1 inline code for small blocks and a library call for large blocks.
1
1 '-mstringop-strategy=ALG'
1 Override the internal decision heuristic for the particular
1 algorithm to use for inlining string operations. The allowed
1 values for ALG are:
1
1 'rep_byte'
1 'rep_4byte'
1 'rep_8byte'
1 Expand using i386 'rep' prefix of the specified size.
1
1 'byte_loop'
1 'loop'
1 'unrolled_loop'
1 Expand into an inline loop.
1
1 'libcall'
1 Always use a library call.
1
1 '-mmemcpy-strategy=STRATEGY'
1 Override the internal decision heuristic to decide if
1 '__builtin_memcpy' should be inlined and what inline algorithm to
1 use when the expected size of the copy operation is known.
1 STRATEGY is a comma-separated list of ALG:MAX_SIZE:DEST_ALIGN
1 triplets. ALG is specified in '-mstringop-strategy', MAX_SIZE
1 specifies the max byte size with which inline algorithm ALG is
1 allowed. For the last triplet, the MAX_SIZE must be '-1'. The
1 MAX_SIZE of the triplets in the list must be specified in
1 increasing order. The minimal byte size for ALG is '0' for the
1 first triplet and 'MAX_SIZE + 1' of the preceding range.
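
The MAX_SIZE rules above can be modeled with a small checker. This is purely illustrative and is not how GCC parses the option: the function names are hypothetical, only the MAX_SIZE field of each triplet is represented, and "increasing order" is read as strictly increasing.

```c
#include <stddef.h>

/* Returns 1 if the MAX_SIZE values form a valid list: non-negative and
   strictly increasing, with -1 ("no upper limit") as the last entry. */
int max_sizes_are_valid(const long *max_size, size_t count)
{
    if (count == 0 || max_size[count - 1] != -1)
        return 0;
    for (size_t i = 0; i + 1 < count; i++) {
        if (max_size[i] < 0)
            return 0;
        if (i > 0 && max_size[i] <= max_size[i - 1])
            return 0;
    }
    return 1;
}

/* Smallest byte size covered by triplet i: 0 for the first triplet,
   otherwise MAX_SIZE + 1 of the preceding triplet. */
long min_size_for(const long *max_size, size_t i)
{
    return i == 0 ? 0 : max_size[i - 1] + 1;
}
```

For three triplets whose MAX_SIZE fields are 64, 4096 and -1, the covered ranges are 0 to 64, 65 to 4096, and 4097 upward.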
1
1 '-mmemset-strategy=STRATEGY'
1 The option is similar to '-mmemcpy-strategy=' except that it is to
1 control '__builtin_memset' expansion.
1
1 '-momit-leaf-frame-pointer'
1 Don't keep the frame pointer in a register for leaf functions.
1 This avoids the instructions to save, set up, and restore frame
1 pointers and makes an extra register available in leaf functions.
1 The option '-fomit-leaf-frame-pointer' removes the frame pointer
1 for leaf functions, which might make debugging harder.
1
1 '-mtls-direct-seg-refs'
1 '-mno-tls-direct-seg-refs'
1 Controls whether TLS variables may be accessed with offsets from
1 the TLS segment register ('%gs' for 32-bit, '%fs' for 64-bit), or
1 whether the thread base pointer must be added. Whether or not this
1 is valid depends on the operating system, and whether it maps the
1 segment to cover the entire TLS area.
1
1 For systems that use the GNU C Library, the default is on.
1
1 '-msse2avx'
1 '-mno-sse2avx'
1 Specify that the assembler should encode SSE instructions with VEX
1 prefix. The option '-mavx' turns this on by default.
1
1 '-mfentry'
1 '-mno-fentry'
1 If profiling is active ('-pg'), put the profiling counter call
1 before the prologue. Note: On x86 architectures the attribute
1 'ms_hook_prologue' isn't possible at the moment for '-mfentry' and
1 '-pg'.
1
1 '-mrecord-mcount'
1 '-mno-record-mcount'
1 If profiling is active ('-pg'), generate a __mcount_loc section
1 that contains pointers to each profiling call. This is useful for
1 automatically patching calls in and out.
1
1 '-mnop-mcount'
1 '-mno-nop-mcount'
1 If profiling is active ('-pg'), generate the calls to the profiling
1 functions as NOPs. This is useful when they should be patched in
1 later dynamically. This is likely only useful together with
1 '-mrecord-mcount'.
1
1 '-mskip-rax-setup'
1 '-mno-skip-rax-setup'
1 When generating code for the x86-64 architecture with SSE
1 extensions disabled, '-mskip-rax-setup' can be used to skip setting
1 up RAX register when there are no variable arguments passed in
1 vector registers.
1
1 *Warning:* Since the RAX register is used to avoid unnecessarily saving
1 vector registers on the stack when passing variable arguments, the
1 impact of this option is that callees may waste some stack space,
1 misbehave or jump to a random location. GCC 4.4 or newer does not
1 have those issues, regardless of the RAX register value.
1
1 '-m8bit-idiv'
1 '-mno-8bit-idiv'
1 On some processors, like Intel Atom, 8-bit unsigned integer divide
1 is much faster than 32-bit/64-bit integer divide. This option
1 generates a run-time check. If both dividend and divisor are
1 within range of 0 to 255, 8-bit unsigned integer divide is used
1 instead of 32-bit/64-bit integer divide.
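
The run-time check can be pictured in C as below. This is a model of the control flow only. The function name is hypothetical, and because C promotes 8-bit operands to 'int', the fast path here does not by itself force an 8-bit divide instruction. The caller must ensure the divisor is nonzero.

```c
#include <stdint.h>

/* Unsigned division with a fast path for operands that both fit in
   8 bits (0..255). */
uint32_t div_with_8bit_fast_path(uint32_t dividend, uint32_t divisor)
{
    /* A single OR-and-mask test covers both operands at once. */
    if (((dividend | divisor) & ~UINT32_C(0xFF)) == 0)
        return (uint32_t)((uint8_t)dividend / (uint8_t)divisor);
    return dividend / divisor;
}
```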
1
1 '-mavx256-split-unaligned-load'
1 '-mavx256-split-unaligned-store'
1 Split 32-byte AVX unaligned load and store.
1
1 '-mstack-protector-guard=GUARD'
1 '-mstack-protector-guard-reg=REG'
1 '-mstack-protector-guard-offset=OFFSET'
1 Generate stack protection code using canary at GUARD. Supported
1 locations are 'global' for global canary or 'tls' for per-thread
1 canary in the TLS block (the default). This option has effect only
1 when '-fstack-protector' or '-fstack-protector-all' is specified.
1
1 With the latter choice the options
1 '-mstack-protector-guard-reg=REG' and
1 '-mstack-protector-guard-offset=OFFSET' furthermore specify which
1 segment register ('%fs' or '%gs') to use as base register for
1 reading the canary, and from what offset from that base register.
1 The default for those is as specified in the relevant ABI.
1
1 '-mmitigate-rop'
1 Try to avoid generating code sequences that contain unintended
1 return opcodes, to mitigate against certain forms of attack. At
1 the moment, this option is limited in what it can do and should not
1 be relied on to provide serious protection.
1
1 '-mgeneral-regs-only'
1 Generate code that uses only the general-purpose registers. This
1 prevents the compiler from using floating-point, vector, mask and
1 bound registers.
1
1 '-mindirect-branch=CHOICE'
1 Convert indirect call and jump with CHOICE. The default is 'keep',
1 which keeps indirect call and jump unmodified. 'thunk' converts
1 indirect call and jump to call and return thunk. 'thunk-inline'
1 converts indirect call and jump to inlined call and return thunk.
1 'thunk-extern' converts indirect call and jump to external call and
1 return thunk provided in a separate object file. You can control
1 this behavior for a specific function by using the function
1 attribute 'indirect_branch'. ⇒Function Attributes.
1
1 Note that '-mcmodel=large' is incompatible with
1 '-mindirect-branch=thunk' and '-mindirect-branch=thunk-extern'
1 since the thunk function may not be reachable in the large code
1 model.
1
1 Note that '-mindirect-branch=thunk-extern' is incompatible with
1 '-fcf-protection=branch' and '-fcheck-pointer-bounds' since the
1 external thunk cannot be modified to disable control-flow check.
1
1 '-mfunction-return=CHOICE'
1 Convert function return with CHOICE. The default is 'keep', which
1 keeps function return unmodified. 'thunk' converts function return
1 to call and return thunk. 'thunk-inline' converts function return
1 to inlined call and return thunk. 'thunk-extern' converts function
1 return to external call and return thunk provided in a separate
1 object file. You can control this behavior for a specific function
1 by using the function attribute 'function_return'. ⇒Function
1 Attributes.
1
1 Note that '-mcmodel=large' is incompatible with
1 '-mfunction-return=thunk' and '-mfunction-return=thunk-extern'
1 since the thunk function may not be reachable in the large code
1 model.
1
1 '-mindirect-branch-register'
1 Force indirect call and jump via register.
1
1 '-mharden-sls=CHOICE'
1 Generate code to mitigate against straight line speculation (SLS)
1 with CHOICE. The default is 'none' which disables all SLS
1 hardening. 'return' enables SLS hardening for function returns.
1 'indirect-jmp' enables SLS hardening for indirect jumps. 'all'
1 enables all SLS hardening.
1
1 '-mindirect-branch-cs-prefix'
1 Add CS prefix to call and jmp to indirect thunk with branch target
1 in r8-r15 registers so that the call and jmp instruction length is
1 6 bytes to allow them to be replaced with 'lfence; call *%r8-r15'
1 or 'lfence; jmp *%r8-r15' at run-time.
1
1 These '-m' switches are supported in addition to the above on x86-64
1 processors in 64-bit environments.
1
1 '-m32'
1 '-m64'
1 '-mx32'
1 '-m16'
1 '-miamcu'
1 Generate code for a 16-bit, 32-bit or 64-bit environment. The
1 '-m32' option sets 'int', 'long', and pointer types to 32 bits, and
1 generates code that runs on any i386 system.
1
1 The '-m64' option sets 'int' to 32 bits and 'long' and pointer
1 types to 64 bits, and generates code for the x86-64 architecture.
1 For Darwin only, the '-m64' option also turns off the '-fno-pic' and
1 '-mdynamic-no-pic' options.
1
1 The '-mx32' option sets 'int', 'long', and pointer types to 32
1 bits, and generates code for the x86-64 architecture.
1
1 The '-m16' option is the same as '-m32', except that it outputs
1 the '.code16gcc' assembly directive at the beginning of the
1 assembly output so that the binary can run in 16-bit mode.
1
1 The '-miamcu' option generates code which conforms to Intel MCU
1 psABI. It requires the '-m32' option to be turned on.
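
The type sizes listed above can be observed from within a program. The sketch below classifies the data model from 'sizeof'. The function name 'data_model' and the returned labels are hypothetical conveniences. Compiling the same file with '-m64', '-m32' or '-mx32' (where the toolchain supports them) should yield different answers.

```c
#include <stddef.h>

/* Classify the current data model from the sizes of the basic types:
   "LP64"  - 32-bit int, 64-bit long and pointers (as with -m64)
   "ILP32" - 32-bit int, long and pointers (as with -m32 or -mx32) */
const char *data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";
    return "other";
}
```

Since '-m32' and '-mx32' both report "ILP32" here, telling them apart requires a predefined macro such as '__x86_64__', which identifies the x86-64 architecture.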
1
1 '-mno-red-zone'
1 Do not use a so-called "red zone" for x86-64 code. The red zone is
1 mandated by the x86-64 ABI; it is a 128-byte area beyond the
1 location of the stack pointer that is not modified by signal or
1 interrupt handlers and therefore can be used for temporary data
1 without adjusting the stack pointer. The flag '-mno-red-zone'
1 disables this red zone.
1
1 '-mcmodel=small'
1 Generate code for the small code model: the program and its symbols
1 must be linked in the lower 2 GB of the address space. Pointers
1 are 64 bits. Programs can be statically or dynamically linked.
1 This is the default code model.
1
1 '-mcmodel=kernel'
1 Generate code for the kernel code model. The kernel runs in the
1 negative 2 GB of the address space. This model has to be used for
1 Linux kernel code.
1
1 '-mcmodel=medium'
1 Generate code for the medium model: the program is linked in the
1 lower 2 GB of the address space. Small symbols are also placed
1 there. Symbols with sizes larger than '-mlarge-data-threshold' are
1 put into large data or BSS sections and can be located above 2GB.
1 Programs can be statically or dynamically linked.
1
1 '-mcmodel=large'
1 Generate code for the large model. This model makes no assumptions
1 about addresses and sizes of sections.
1
1 '-maddress-mode=long'
1 Generate code for long address mode. This is only supported for
1 64-bit and x32 environments. It is the default address mode for
1 64-bit environments.
1
1 '-maddress-mode=short'
1 Generate code for short address mode. This is only supported for
1 32-bit and x32 environments. It is the default address mode for
1 32-bit and x32 environments.
1