gcc: Code Gen Options
1
3.16 Options for Code Generation Conventions
============================================
1
1 These machine-independent options control the interface conventions used
1 in code generation.
1
1 Most of them have both positive and negative forms; the negative form
1 of '-ffoo' is '-fno-foo'. In the table below, only one of the forms is
1 listed--the one that is not the default. You can figure out the other
1 form by either removing 'no-' or adding it.
1
1 '-fstack-reuse=REUSE-LEVEL'
This option controls stack space reuse for user-declared local/auto
variables and compiler-generated temporaries. REUSE-LEVEL can be
'all', 'named_vars', or 'none'. 'all' enables stack reuse for all
local variables and temporaries, 'named_vars' enables the reuse
only for user-defined local variables with names, and 'none'
disables stack reuse completely. The default value is 'all'. The
option is needed when the program extends the lifetime of a scoped
local variable or a compiler-generated temporary beyond the end
point defined by the language. When the lifetime of a variable ends,
and if the variable lives in memory, the optimizing compiler has
the freedom to reuse its stack space for other temporaries or
scoped local variables whose live ranges do not overlap with it.
Legacy code that extends local lifetimes is likely to break under
the stack reuse optimization.
1
1 For example,
1
    int *p;
    {
      int local1;

      p = &local1;
      local1 = 10;
      ....
    }
    {
      int local2;
      local2 = 20;
      ...
    }

    if (*p == 10)  // out of scope use of local1
      {

      }
1
1 Another example:
1
    struct A
    {
      A(int k) : i(k), j(k) { }
      int i;
      int j;
    };

    A *ap;

    void foo(const A& ar)
    {
      ap = const_cast<A *>(&ar);  // cast needed: &ar has type 'const A *'
    }

    void bar()
    {
      foo(A(10)); // temp object's lifetime ends when foo returns

      {
        A a(20);
        ....
      }
      ap->i += 10;  // ap references out of scope temp whose space
                    // is reused with a. What is the value of ap->i?
    }
1
1
The lifetime of a compiler-generated temporary is well defined by
the C++ standard. When the lifetime of a temporary ends, and if the
temporary lives in memory, the optimizing compiler has the freedom
to reuse its stack space for other temporaries or scoped local
variables whose live ranges do not overlap with it. However, some
legacy code relies on the behavior of older compilers, in which
temporaries' stack space is not reused, so aggressive stack reuse
can lead to runtime errors. This option is used to control the
temporary stack reuse optimization.
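
Rather than relying on '-fstack-reuse=none', the first example can
usually be rewritten so that the pointed-to object stays in scope for
every use. The following C sketch is illustrative only; the function
name 'read_after_scopes' is hypothetical and not part of the original
example.

```c
/* Hypothetical rewrite of the first example: the pointed-to variable is
   declared in the enclosing scope, so its lifetime covers the final
   read and its stack slot cannot be handed to 'local2'.  */
int
read_after_scopes (void)
{
  int kept;              /* lives until the function returns */
  int *p = &kept;

  {
    kept = 10;
  }
  {
    int local2 = 20;     /* may share space with other dead locals,
                            but never with 'kept' */
    (void) local2;
  }

  return *p;             /* well defined: 'kept' is still in scope */
}
```

Written this way, the code does not depend on the REUSE-LEVEL chosen.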
1
1 '-ftrapv'
This option generates traps for signed overflow on addition,
subtraction, and multiplication operations. The options '-ftrapv'
and '-fwrapv' override each other, so using '-ftrapv' '-fwrapv' on
the command line results in '-fwrapv' being effective. Note that
only active options override, so using '-ftrapv' '-fwrapv'
'-fno-wrapv' on the command line results in '-ftrapv' being
effective.
1
1 '-fwrapv'
This option instructs the compiler to assume that signed arithmetic
overflow of addition, subtraction and multiplication wraps around
using two's-complement representation. This flag enables some
optimizations and disables others. The options '-ftrapv' and
'-fwrapv' override each other, so using '-ftrapv' '-fwrapv' on the
command line results in '-fwrapv' being effective. Note that only
active options override, so using '-ftrapv' '-fwrapv' '-fno-wrapv'
on the command line results in '-ftrapv' being effective.
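
Code that must behave the same way whether '-ftrapv', '-fwrapv', or
neither is in effect can test for overflow explicitly instead of
letting it happen. The following is a minimal sketch, assuming a
compiler that provides the '__builtin_add_overflow' built-in (GCC 5
and later); the helper names 'checked_add' and 'saturating_add' are
hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: add A and B, store the wrapped sum in *SUM, and
   return true if the exact result did not fit in an 'int'.  The
   built-in does the check itself, so no overflowing signed addition is
   ever evaluated.  */
bool
checked_add (int a, int b, int *sum)
{
  return __builtin_add_overflow (a, b, sum);
}

/* Hypothetical helper: clamp to INT_MAX or INT_MIN instead of wrapping
   or trapping.  An 'int' addition can only overflow upward when B is
   positive and downward when B is negative.  */
int
saturating_add (int a, int b)
{
  int sum;

  if (!checked_add (a, b, &sum))
    return sum;
  return b > 0 ? INT_MAX : INT_MIN;
}
```

For example, 'saturating_add (INT_MAX, 1)' yields 'INT_MAX' regardless
of which of the two options is active.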
1
1 '-fwrapv-pointer'
This option instructs the compiler to assume that pointer
arithmetic overflow on addition and subtraction wraps around using
two's-complement representation. This flag disables some
optimizations which assume pointer overflow is invalid.
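
Bounds checks of the form 'ptr + len < ptr' depend on pointer
arithmetic wrapping around, which is the assumption this option makes
valid. A sketch of an alternative that compares sizes rather than
pointers, and therefore does not depend on the option, follows; the
name 'range_fits' and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if LEN bytes starting at OFFSET lie
   entirely inside a buffer of BUF_LEN bytes.  Only unsigned sizes are
   compared, so no pointer is ever advanced past the end of the buffer.
   The subtraction cannot wrap because OFFSET <= BUF_LEN is tested
   first.  */
bool
range_fits (size_t buf_len, size_t offset, size_t len)
{
  return offset <= buf_len && len <= buf_len - offset;
}
```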
1
1 '-fstrict-overflow'
1 This option implies '-fno-wrapv' '-fno-wrapv-pointer' and when
1 negated implies '-fwrapv' '-fwrapv-pointer'.
1
1 '-fexceptions'
1 Enable exception handling. Generates extra code needed to
1 propagate exceptions. For some targets, this implies GCC generates
1 frame unwind information for all functions, which can produce
1 significant data size overhead, although it does not affect
1 execution. If you do not specify this option, GCC enables it by
1 default for languages like C++ that normally require exception
1 handling, and disables it for languages like C that do not normally
1 require it. However, you may need to enable this option when
1 compiling C code that needs to interoperate properly with exception
1 handlers written in C++. You may also wish to disable this option
1 if you are compiling older C++ programs that don't use exception
1 handling.
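
As an illustration of the C interoperation case, GCC's 'cleanup'
variable attribute runs its handler when the variable goes out of
scope, and, when the C code is compiled with '-fexceptions', also
while an exception unwinds through the function. The sketch below
assumes a C function that a C++ caller might hand a throwing
callback; all function names in it are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cleanup handler: receives a pointer to the variable
   that is going out of scope.  */
static void
free_buffer (char **slot)
{
  free (*slot);
  *slot = NULL;
}

/* Hypothetical C function that lends a scratch buffer to CALLBACK.
   The buffer is released on normal return; with '-fexceptions' it is
   also released if CALLBACK is a C++ function that throws.  */
int
with_scratch_buffer (int (*callback) (char *buf, size_t len))
{
  char *buf __attribute__ ((cleanup (free_buffer))) = malloc (64);

  if (buf == NULL)
    return -1;
  return callback (buf, 64);
}

/* Example callback: zero the buffer and report its length.  */
int
zero_fill (char *buf, size_t len)
{
  memset (buf, 0, len);
  return (int) len;
}
```

Note that the 'cleanup' attribute only performs an action during
unwinding; it does not allow the C code to catch the exception.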
1
1 '-fnon-call-exceptions'
1 Generate code that allows trapping instructions to throw
1 exceptions. Note that this requires platform-specific runtime
1 support that does not exist everywhere. Moreover, it only allows
1 _trapping_ instructions to throw exceptions, i.e. memory references
1 or floating-point instructions. It does not allow exceptions to be
1 thrown from arbitrary signal handlers such as 'SIGALRM'.
1
1 '-fdelete-dead-exceptions'
1 Consider that instructions that may throw exceptions but don't
1 otherwise contribute to the execution of the program can be
1 optimized away. This option is enabled by default for the Ada
1 front end, as permitted by the Ada language specification.
1 Optimization passes that cause dead exceptions to be removed are
1 enabled independently at different optimization levels.
1
1 '-funwind-tables'
1 Similar to '-fexceptions', except that it just generates any needed
1 static data, but does not affect the generated code in any other
1 way. You normally do not need to enable this option; instead, a
1 language processor that needs this handling enables it on your
1 behalf.
1
1 '-fasynchronous-unwind-tables'
Generate an unwind table in DWARF format, if supported by the target
machine. The table is exact at each instruction boundary, so it
can be used for stack unwinding from asynchronous events (such as
a debugger or garbage collector).
1
1 '-fno-gnu-unique'
1 On systems with recent GNU assembler and C library, the C++
1 compiler uses the 'STB_GNU_UNIQUE' binding to make sure that
1 definitions of template static data members and static local
1 variables in inline functions are unique even in the presence of
1 'RTLD_LOCAL'; this is necessary to avoid problems with a library
1 used by two different 'RTLD_LOCAL' plugins depending on a
1 definition in one of them and therefore disagreeing with the other
1 one about the binding of the symbol. But this causes 'dlclose' to
1 be ignored for affected DSOs; if your program relies on
1 reinitialization of a DSO via 'dlclose' and 'dlopen', you can use
1 '-fno-gnu-unique'.
1
1 '-fpcc-struct-return'
1 Return "short" 'struct' and 'union' values in memory like longer
1 ones, rather than in registers. This convention is less efficient,
1 but it has the advantage of allowing intercallability between
1 GCC-compiled files and files compiled with other compilers,
1 particularly the Portable C Compiler (pcc).
1
1 The precise convention for returning structures in memory depends
1 on the target configuration macros.
1
1 Short structures and unions are those whose size and alignment
1 match that of some integer type.
1
1 *Warning:* code compiled with the '-fpcc-struct-return' switch is
1 not binary compatible with code compiled with the
1 '-freg-struct-return' switch. Use it to conform to a non-default
1 application binary interface.
1
1 '-freg-struct-return'
1 Return 'struct' and 'union' values in registers when possible.
1 This is more efficient for small structures than
1 '-fpcc-struct-return'.
1
1 If you specify neither '-fpcc-struct-return' nor
1 '-freg-struct-return', GCC defaults to whichever convention is
1 standard for the target. If there is no standard convention, GCC
1 defaults to '-fpcc-struct-return', except on targets where GCC is
1 the principal compiler. In those cases, we can choose the
1 standard, and we chose the more efficient register return
1 alternative.
1
1 *Warning:* code compiled with the '-freg-struct-return' switch is
1 not binary compatible with code compiled with the
1 '-fpcc-struct-return' switch. Use it to conform to a non-default
1 application binary interface.
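
For illustration, the following C sketch defines a "short" structure
in the sense described above: its size and alignment match those of
'int'. Whether 'make_handle' returns it in a register or in memory is
exactly what these two options select. The type and function names
are hypothetical.

```c
/* A structure whose size and alignment match those of 'int'.  */
struct handle
{
  int id;
};

/* Returns the structure by value.  The source is identical under both
   conventions; only the generated calling sequence differs, which is
   why callers and callees must agree on the option used.  */
struct handle
make_handle (int id)
{
  struct handle h = { id };
  return h;
}
```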
1
1 '-fshort-enums'
1 Allocate to an 'enum' type only as many bytes as it needs for the
1 declared range of possible values. Specifically, the 'enum' type
1 is equivalent to the smallest integer type that has enough room.
1
1 *Warning:* the '-fshort-enums' switch causes GCC to generate code
1 that is not binary compatible with code generated without that
1 switch. Use it to conform to a non-default application binary
1 interface.
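
Because the size of an 'enum' type depends on this switch, a
structure that embeds one changes layout with it. A minimal sketch of
one way to keep a shared structure stable is to store the value in a
fixed-width integer and convert at the boundary; the names 'color',
'pixel', 'make_pixel' and 'pixel_color' are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum color { RED, GREEN, BLUE };

/* Size the compiler chose for 'enum color'; it may be smaller when
   '-fshort-enums' is in effect.  */
size_t
color_size (void)
{
  return sizeof (enum color);
}

/* Layout that does not depend on the size of the 'enum' type.  */
struct pixel
{
  uint8_t color;   /* holds an 'enum color' value */
  uint8_t alpha;
};

struct pixel
make_pixel (enum color c, uint8_t alpha)
{
  struct pixel p = { (uint8_t) c, alpha };
  return p;
}

enum color
pixel_color (struct pixel p)
{
  return (enum color) p.color;
}
```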
1
1 '-fshort-wchar'
1 Override the underlying type for 'wchar_t' to be 'short unsigned
1 int' instead of the default for the target. This option is useful
1 for building programs to run under WINE.
1
1 *Warning:* the '-fshort-wchar' switch causes GCC to generate code
1 that is not binary compatible with code generated without that
1 switch. Use it to conform to a non-default application binary
1 interface.
1
1 '-fno-common'
1 In C code, this option controls the placement of global variables
1 defined without an initializer, known as "tentative definitions" in
1 the C standard. Tentative definitions are distinct from
1 declarations of a variable with the 'extern' keyword, which do not
1 allocate storage.
1
1 Unix C compilers have traditionally allocated storage for
1 uninitialized global variables in a common block. This allows the
1 linker to resolve all tentative definitions of the same variable in
1 different compilation units to the same object, or to a
1 non-tentative definition. This is the behavior specified by
1 '-fcommon', and is the default for GCC on most targets. On the
1 other hand, this behavior is not required by ISO C, and on some
1 targets may carry a speed or code size penalty on variable
1 references.
1
1 The '-fno-common' option specifies that the compiler should instead
1 place uninitialized global variables in the data section of the
1 object file. This inhibits the merging of tentative definitions by
1 the linker so you get a multiple-definition error if the same
1 variable is defined in more than one compilation unit. Compiling
1 with '-fno-common' is useful on targets for which it provides
1 better performance, or if you wish to verify that the program will
1 work on other systems that always treat uninitialized variable
1 definitions this way.
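
Code that keeps a single definition and uses 'extern' declarations
everywhere else links the same way under '-fcommon' and
'-fno-common'. The sketch below shows both pieces in one file for
brevity; in a real program the declaration would normally sit in a
header. The names 'request_count' and 'note_request' are
hypothetical.

```c
/* Would normally live in a shared header: a declaration only, which
   allocates no storage.  */
extern int request_count;

/* Would normally live in exactly one source file.  The explicit
   initializer makes this a non-tentative definition.  */
int request_count = 0;

int
note_request (void)
{
  return ++request_count;
}
```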
1
1 '-fno-ident'
1 Ignore the '#ident' directive.
1
1 '-finhibit-size-directive'
1 Don't output a '.size' assembler directive, or anything else that
1 would cause trouble if the function is split in the middle, and the
1 two halves are placed at locations far apart in memory. This
1 option is used when compiling 'crtstuff.c'; you should not need to
1 use it for anything else.
1
1 '-fverbose-asm'
1 Put extra commentary information in the generated assembly code to
1 make it more readable. This option is generally only of use to
1 those who actually need to read the generated assembly code
1 (perhaps while debugging the compiler itself).
1
1 '-fno-verbose-asm', the default, causes the extra information to be
1 omitted and is useful when comparing two assembler files.
1
1 The added comments include:
1
1 * information on the compiler version and command-line options,
1
1 * the source code lines associated with the assembly
1 instructions, in the form FILENAME:LINENUMBER:CONTENT OF LINE,
1
1 * hints on which high-level expressions correspond to the
1 various assembly instruction operands.
1
1 For example, given this C source file:
1
    int test (int n)
    {
      int i;
      int total = 0;

      for (i = 0; i < n; i++)
        total += i * i;

      return total;
    }
1
compiling to (x86_64) assembly via '-S' and emitting the result
directly to stdout via '-o' '-'
1
1 gcc -S test.c -fverbose-asm -Os -o -
1
1 gives output similar to this:
1
            .file   "test.c"
    # GNU C11 (GCC) version 7.0.0 20160809 (experimental) (x86_64-pc-linux-gnu)
      [...snip...]
    # options passed:
      [...snip...]

            .text
            .globl  test
            .type   test, @function
    test:
    .LFB0:
            .cfi_startproc
    # test.c:4:   int total = 0;
            xorl    %eax, %eax      # <retval>
    # test.c:6:   for (i = 0; i < n; i++)
            xorl    %edx, %edx      # i
    .L2:
    # test.c:6:   for (i = 0; i < n; i++)
            cmpl    %edi, %edx      # n, i
            jge     .L5             #,
    # test.c:7:     total += i * i;
            movl    %edx, %ecx      # i, tmp92
            imull   %edx, %ecx      # i, tmp92
    # test.c:6:   for (i = 0; i < n; i++)
            incl    %edx            # i
    # test.c:7:     total += i * i;
            addl    %ecx, %eax      # tmp92, <retval>
            jmp     .L2             #
    .L5:
    # test.c:10: }
            ret
            .cfi_endproc
    .LFE0:
            .size   test, .-test
            .ident  "GCC: (GNU) 7.0.0 20160809 (experimental)"
            .section        .note.GNU-stack,"",@progbits
1
1 The comments are intended for humans rather than machines and hence
1 the precise format of the comments is subject to change.
1
1 '-frecord-gcc-switches'
1 This switch causes the command line used to invoke the compiler to
1 be recorded into the object file that is being created. This
1 switch is only implemented on some targets and the exact format of
1 the recording is target and binary file format dependent, but it
1 usually takes the form of a section containing ASCII text. This
1 switch is related to the '-fverbose-asm' switch, but that switch
1 only records information in the assembler output file as comments,
1 so it never reaches the object file. See also
1 '-grecord-gcc-switches' for another way of storing compiler options
1 into the object file.
1
1 '-fpic'
1 Generate position-independent code (PIC) suitable for use in a
1 shared library, if supported for the target machine. Such code
1 accesses all constant addresses through a global offset table
1 (GOT). The dynamic loader resolves the GOT entries when the
1 program starts (the dynamic loader is not part of GCC; it is part
1 of the operating system). If the GOT size for the linked
1 executable exceeds a machine-specific maximum size, you get an
1 error message from the linker indicating that '-fpic' does not
1 work; in that case, recompile with '-fPIC' instead. (These
1 maximums are 8k on the SPARC, 28k on AArch64 and 32k on the m68k
1 and RS/6000. The x86 has no such limit.)
1
1 Position-independent code requires special support, and therefore
1 works only on certain machines. For the x86, GCC supports PIC for
1 System V but not for the Sun 386i. Code generated for the IBM
1 RS/6000 is always position-independent.
1
1 When this flag is set, the macros '__pic__' and '__PIC__' are
1 defined to 1.
1
1 '-fPIC'
1 If supported for the target machine, emit position-independent
1 code, suitable for dynamic linking and avoiding any limit on the
1 size of the global offset table. This option makes a difference on
1 AArch64, m68k, PowerPC and SPARC.
1
1 Position-independent code requires special support, and therefore
1 works only on certain machines.
1
1 When this flag is set, the macros '__pic__' and '__PIC__' are
1 defined to 2.
1
1 '-fpie'
1 '-fPIE'
These options are similar to '-fpic' and '-fPIC', but the generated
position-independent code can only be linked into executables.
Usually these options are used when the '-pie' GCC option is used
during linking.
1
1 '-fpie' and '-fPIE' both define the macros '__pie__' and '__PIE__'.
1 The macros have the value 1 for '-fpie' and 2 for '-fPIE'.
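
The macros described for '-fpic', '-fPIC', '-fpie' and '-fPIE' can be
tested in source code. The following C sketch reports the level a
translation unit was compiled with, treating an undefined macro as 0;
the function names 'pic_level' and 'pie_level' are hypothetical.

```c
/* Value of '__PIC__' (1 for '-fpic', 2 for '-fPIC'), or 0 if the
   macro is not defined.  */
int
pic_level (void)
{
#ifdef __PIC__
  return __PIC__;
#else
  return 0;
#endif
}

/* Value of '__PIE__' (1 for '-fpie', 2 for '-fPIE'), or 0 if the
   macro is not defined.  */
int
pie_level (void)
{
#ifdef __PIE__
  return __PIE__;
#else
  return 0;
#endif
}
```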
1
1 '-fno-plt'
1 Do not use the PLT for external function calls in
1 position-independent code. Instead, load the callee address at
1 call sites from the GOT and branch to it. This leads to more
1 efficient code by eliminating PLT stubs and exposing GOT loads to
1 optimizations. On architectures such as 32-bit x86 where PLT stubs
1 expect the GOT pointer in a specific register, this gives more
1 register allocation freedom to the compiler. Lazy binding requires
1 use of the PLT; with '-fno-plt' all external symbols are resolved
1 at load time.
1
1 Alternatively, the function attribute 'noplt' can be used to avoid
1 calls through the PLT for specific external functions.
1
1 In position-dependent code, a few targets also convert calls to
1 functions that are marked to not use the PLT to use the GOT
1 instead.
1
1 '-fno-jump-tables'
1 Do not use jump tables for switch statements even where it would be
1 more efficient than other code generation strategies. This option
1 is of use in conjunction with '-fpic' or '-fPIC' for building code
1 that forms part of a dynamic linker and cannot reference the
1 address of a jump table. On some targets, jump tables do not
1 require a GOT and this option is not needed.
1
1 '-ffixed-REG'
1 Treat the register named REG as a fixed register; generated code
1 should never refer to it (except perhaps as a stack pointer, frame
1 pointer or in some other fixed role).
1
1 REG must be the name of a register. The register names accepted
1 are machine-specific and are defined in the 'REGISTER_NAMES' macro
1 in the machine description macro file.
1
1 This flag does not have a negative form, because it specifies a
1 three-way choice.
1
1 '-fcall-used-REG'
1 Treat the register named REG as an allocable register that is
1 clobbered by function calls. It may be allocated for temporaries
1 or variables that do not live across a call. Functions compiled
1 this way do not save and restore the register REG.
1
1 It is an error to use this flag with the frame pointer or stack
1 pointer. Use of this flag for other registers that have fixed
1 pervasive roles in the machine's execution model produces
1 disastrous results.
1
1 This flag does not have a negative form, because it specifies a
1 three-way choice.
1
1 '-fcall-saved-REG'
1 Treat the register named REG as an allocable register saved by
1 functions. It may be allocated even for temporaries or variables
1 that live across a call. Functions compiled this way save and
1 restore the register REG if they use it.
1
1 It is an error to use this flag with the frame pointer or stack
1 pointer. Use of this flag for other registers that have fixed
1 pervasive roles in the machine's execution model produces
1 disastrous results.
1
1 A different sort of disaster results from the use of this flag for
1 a register in which function values may be returned.
1
1 This flag does not have a negative form, because it specifies a
1 three-way choice.
1
1 '-fpack-struct[=N]'
Without a value specified, pack all structure members together
without holes. When a value is specified (which must be a small
power of two), pack structure members according to this value,
representing the maximum alignment (that is, objects with default
alignment requirements larger than this are output potentially
unaligned at the next fitting location).
1
1 *Warning:* the '-fpack-struct' switch causes GCC to generate code
1 that is not binary compatible with code generated without that
1 switch. Additionally, it makes the code suboptimal. Use it to
1 conform to a non-default application binary interface.
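
To see what packing does to a layout without changing the whole
program, GCC's 'packed' type attribute can be applied to a single
structure. The sketch below assumes this attribute gives a layout
comparable to what '-fpack-struct' would give that one type; the
structure and function names are hypothetical.

```c
#include <stddef.h>

/* Laid out according to the target's normal alignment rules.  */
struct record_natural
{
  char tag;
  int value;
};

/* Same members with no padding between them.  */
struct record_packed
{
  char tag;
  int value;
} __attribute__ ((packed));

/* Byte offset of 'value' in each layout.  */
size_t
natural_value_offset (void)
{
  return offsetof (struct record_natural, value);
}

size_t
packed_value_offset (void)
{
  return offsetof (struct record_packed, value);
}
```

In the packed layout 'value' starts at offset 1, so it is potentially
unaligned, which is the source of the suboptimal code mentioned in
the warning above.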
1
1 '-fleading-underscore'
1 This option and its counterpart, '-fno-leading-underscore',
1 forcibly change the way C symbols are represented in the object
1 file. One use is to help link with legacy assembly code.
1
1 *Warning:* the '-fleading-underscore' switch causes GCC to generate
1 code that is not binary compatible with code generated without that
1 switch. Use it to conform to a non-default application binary
1 interface. Not all targets provide complete support for this
1 switch.
1
1 '-ftls-model=MODEL'
Alter the thread-local storage model to be used (see
Thread-Local). The MODEL argument should be one of
1 'global-dynamic', 'local-dynamic', 'initial-exec' or 'local-exec'.
1 Note that the choice is subject to optimization: the compiler may
1 use a more efficient model for symbols not visible outside of the
1 translation unit, or if '-fpic' is not given on the command line.
1
1 The default without '-fpic' is 'initial-exec'; with '-fpic' the
1 default is 'global-dynamic'.
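
The model can also be chosen for one variable at a time with GCC's
'tls_model' variable attribute, which overrides '-ftls-model' for
that variable. A minimal sketch follows; the variable and function
names are hypothetical, and 'initial-exec' is chosen only as an
example value.

```c
/* Per-thread counter.  '__thread' is the GNU keyword for thread-local
   storage; the attribute requests a specific TLS model for this one
   variable.  */
static __thread int calls_on_this_thread
  __attribute__ ((tls_model ("initial-exec")));

/* Count calls made by the calling thread.  */
int
count_call (void)
{
  return ++calls_on_this_thread;
}
```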
1
1 '-ftrampolines'
For targets that normally need trampolines for nested functions,
always generate them instead of using descriptors. Otherwise, for
targets that do not need them, such as HP-PA or IA-64, do
nothing.
1
1 A trampoline is a small piece of code that is created at run time
1 on the stack when the address of a nested function is taken, and is
1 used to call the nested function indirectly. Therefore, it
1 requires the stack to be made executable in order for the program
1 to work properly.
1
1 '-fno-trampolines' is enabled by default on a language by language
1 basis to let the compiler avoid generating them, if it computes
1 that this is safe, and replace them with descriptors. Descriptors
1 are made up of data only, but the generated code must be prepared
1 to deal with them. As of this writing, '-fno-trampolines' is
1 enabled by default only for Ada.
1
1 Moreover, code compiled with '-ftrampolines' and code compiled with
1 '-fno-trampolines' are not binary compatible if nested functions
1 are present. This option must therefore be used on a program-wide
1 basis and be manipulated with extreme care.
1
1 '-fvisibility=[default|internal|hidden|protected]'
1 Set the default ELF image symbol visibility to the specified
1 option--all symbols are marked with this unless overridden within
1 the code. Using this feature can very substantially improve
1 linking and load times of shared object libraries, produce more
1 optimized code, provide near-perfect API export and prevent symbol
1 clashes. It is *strongly* recommended that you use this in any
1 shared objects you distribute.
1
1 Despite the nomenclature, 'default' always means public; i.e.,
1 available to be linked against from outside the shared object.
1 'protected' and 'internal' are pretty useless in real-world usage
1 so the only other commonly used option is 'hidden'. The default if
1 '-fvisibility' isn't specified is 'default', i.e., make every
1 symbol public.
1
A good explanation of the benefits offered by ensuring ELF symbols
have the correct visibility is given by "How To Write Shared
Libraries" by Ulrich Drepper (which can be found at
<https://www.akkadia.org/drepper/>). However, rather than marking
things hidden when the default is public, this option makes a
superior solution possible: make the default hidden and mark things
public. This is the norm with DLLs on Windows, and with
'-fvisibility=hidden' and '__attribute__ ((visibility("default")))'
instead of '__declspec(dllexport)' you get almost identical
semantics with identical syntax. This is a great boon to those
working with cross-platform projects.
1
For those adding visibility support to existing code, you may find
'#pragma GCC visibility' of use. You use it by enclosing the
declarations you wish to set visibility for with (for example)
'#pragma GCC visibility push(hidden)' and '#pragma GCC visibility
pop'. Bear in mind that symbol visibility should be viewed *as
part of the API contract* and thus all new code should
always specify visibility when it is not the default; i.e.,
declarations only for use within the local DSO should *always* be
marked explicitly as hidden so as to avoid PLT indirection
overheads--making this abundantly clear also aids readability and
self-documentation of the code. Note that due to ISO C++
specification requirements, 'operator new' and 'operator delete'
must always be of default visibility.
1
1 Be aware that headers from outside your project, in particular
1 system headers and headers from any other library you use, may not
1 be expecting to be compiled with visibility other than the default.
1 You may need to explicitly say '#pragma GCC visibility
1 push(default)' before including any such headers.
1
'extern' declarations are not affected by '-fvisibility', so a lot
of code can be recompiled with '-fvisibility=hidden' with no
modifications. However, this means that calls to 'extern'
functions with no explicit visibility use the PLT, so it is more
effective to use '__attribute__ ((visibility))' and/or '#pragma GCC
visibility' to tell the compiler which 'extern' declarations should
be treated as hidden.
1
Note that '-fvisibility' does affect C++ vague linkage entities.
This means that, for instance, an exception class that is thrown
between DSOs must be explicitly marked with default visibility so
that the 'type_info' nodes are unified between the DSOs.
1
1 An overview of these techniques, their benefits and how to use them
1 is at <http://gcc.gnu.org/wiki/Visibility>.
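
The attribute and pragma forms mentioned above can be combined as in
the following C sketch, which is meant for a library built with
'-fvisibility=hidden'. The macro names 'MYLIB_API' and 'MYLIB_LOCAL'
and all function names are hypothetical.

```c
/* Hypothetical export macros for a library called "mylib".  */
#define MYLIB_API   __attribute__ ((visibility ("default")))
#define MYLIB_LOCAL __attribute__ ((visibility ("hidden")))

/* Helper used only inside the DSO: explicitly hidden.  */
MYLIB_LOCAL int
mylib_clamp (int value, int lo, int hi)
{
  if (value < lo)
    return lo;
  return value > hi ? hi : value;
}

/* Part of the public interface: explicitly exported.  */
MYLIB_API int
mylib_percent (int value)
{
  return mylib_clamp (value, 0, 100);
}

/* The pragma hides everything declared between push and pop without
   annotating each declaration.  */
#pragma GCC visibility push(hidden)
int
mylib_double (int value)
{
  return value * 2;
}
#pragma GCC visibility pop
```

Marking the internal helpers explicitly keeps the intent visible in
the source even when the command-line default changes.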
1
1 '-fstrict-volatile-bitfields'
1 This option should be used if accesses to volatile bit-fields (or
1 other structure fields, although the compiler usually honors those
1 types anyway) should use a single access of the width of the
1 field's type, aligned to a natural alignment if possible. For
1 example, targets with memory-mapped peripheral registers might
1 require all such accesses to be 16 bits wide; with this flag you
1 can declare all peripheral bit-fields as 'unsigned short' (assuming
1 short is 16 bits on these targets) to force GCC to use 16-bit
1 accesses instead of, perhaps, a more efficient 32-bit access.
1
1 If this option is disabled, the compiler uses the most efficient
1 instruction. In the previous example, that might be a 32-bit load
1 instruction, even though that accesses bytes that do not contain
1 any portion of the bit-field, or memory-mapped registers unrelated
1 to the one being updated.
1
1 In some cases, such as when the 'packed' attribute is applied to a
1 structure field, it may not be possible to access the field with a
1 single read or write that is correctly aligned for the target
1 machine. In this case GCC falls back to generating multiple
1 accesses rather than code that will fault or truncate the result at
1 run time.
1
1 Note: Due to restrictions of the C/C++11 memory model, write
1 accesses are not allowed to touch non bit-field members. It is
1 therefore recommended to define all bits of the field's type as
1 bit-field members.
1
1 The default value of this option is determined by the application
1 binary interface for the target processor.
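
As a sketch of the recommendation above, the following C structure
describes a hypothetical 16-bit peripheral register and defines all
16 bits of the field's type as bit-field members. The register
layout, field names and function names are invented for
illustration, and 'unsigned short' is assumed to be 16 bits wide.

```c
/* Hypothetical control register: every bit of the 16-bit
   'unsigned short' is covered by a bit-field member.  */
struct uart_ctrl
{
  volatile unsigned short enable   : 1;
  volatile unsigned short parity   : 2;
  volatile unsigned short baud_sel : 4;
  volatile unsigned short reserved : 9;
};

/* Program the register.  With '-fstrict-volatile-bitfields' each
   assignment is intended to use an access of the field's declared
   width.  */
void
uart_configure (struct uart_ctrl *reg, unsigned parity, unsigned baud_sel)
{
  reg->parity = parity & 0x3u;
  reg->baud_sel = baud_sel & 0xFu;
  reg->enable = 1;
}

/* Exercise the layout on an ordinary object standing in for the
   device; returns nonzero if the fields read back as written.  */
int
uart_selftest (void)
{
  struct uart_ctrl fake = { 0, 0, 0, 0 };

  uart_configure (&fake, 2, 9);
  return fake.enable == 1 && fake.parity == 2 && fake.baud_sel == 9;
}
```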
1
1 '-fsync-libcalls'
1 This option controls whether any out-of-line instance of the
1 '__sync' family of functions may be used to implement the C++11
1 '__atomic' family of functions.
1
1 The default value of this option is enabled, thus the only useful
1 form of the option is '-fno-sync-libcalls'. This option is used in
1 the implementation of the 'libatomic' runtime library.
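
For reference, the '__atomic' built-ins this option refers to are
used as in the following C sketch. Whether the compiler expands them
inline or calls an out-of-line routine depends on the target and the
options in effect. The variable and function names are hypothetical.

```c
static int shared_hits;

/* Atomically increment the counter and return its previous value.  */
int
record_hit (void)
{
  return __atomic_fetch_add (&shared_hits, 1, __ATOMIC_SEQ_CST);
}

/* Atomically read the current value of the counter.  */
int
hits_so_far (void)
{
  return __atomic_load_n (&shared_hits, __ATOMIC_SEQ_CST);
}
```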
1