gccint: WHOPR

25.4 Whole program assumptions, linker plugin and symbol visibilities
=====================================================================

Link-time optimization gives relatively minor benefits when used alone.
The problem is that propagation of inter-procedural information does not
work well across functions and variables that are called or referenced
by other compilation units (such as from a dynamically linked library).
We say that such functions and variables are _externally visible_.

 To make the situation even more difficult, many applications organize
themselves as a set of shared libraries, and the default ELF visibility
rules allow one to override any externally visible symbol with a
different symbol at runtime.  This effectively disables any optimization
across such functions and variables, because the compiler cannot be sure
that the function body it is seeing is the same function body that will
be used at runtime.  Consequently, any function or variable not declared
'static' in the sources degrades the quality of inter-procedural
optimization.
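
 As a minimal sketch of the difference, consider the hypothetical C file
below.  All function names are illustrative and do not come from GCC.
Both callers compute the same value, but only the call to the 'static'
function is guaranteed to reach the body the compiler is looking at.

```c
/* interpose.c: hypothetical example; all names are illustrative.  */

/* Externally visible: when this code ends up in a shared library, the
   default ELF rules allow a different definition of scale() to be
   bound at runtime, so the compiler cannot assume that this body is
   the one that will actually be called.  */
int scale(int x)
{
    return 2 * x;
}

/* Local to this file: no other definition can replace it, so the
   compiler is free to inline it or specialize it for its callers.  */
static int scale_local(int x)
{
    return 2 * x;
}

/* Calls the externally visible function.  */
int compute_via_external(int x)
{
    return scale(x) + 1;
}

/* Calls the file-local function.  */
int compute_via_local(int x)
{
    return scale_local(x) + 1;
}
```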

 To avoid this problem the compiler must assume that it sees the whole
program when doing link-time optimization.  Strictly speaking, the whole
program is rarely visible even at link-time.  Standard system libraries
are usually linked dynamically or not provided with the link-time
information.  In GCC, the whole program option ('-fwhole-program')
asserts that every function and variable defined in the current
compilation unit is static, except for function 'main' (note: at link
time, the current unit is the union of all objects compiled with LTO).
Since some functions and variables need to be referenced externally, for
example by another DSO or from an assembler file, GCC also provides the
function and variable attribute 'externally_visible' which can be used
to disable the effect of '-fwhole-program' on a specific symbol.
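
 The following sketch shows how the attribute might be applied.  The
file and symbol names are hypothetical.  Only the attribute
'externally_visible' and the option '-fwhole-program' come from the
text above.

```c
/* whole.c: hypothetical unit; all symbol names are illustrative.  */

/* Not referenced from outside the LTO unit: under -fwhole-program GCC
   may treat this function as if it had been declared static.  */
int helper(int x)
{
    return x * x;
}

/* Assumed to be called from code the compiler cannot see, for example
   an assembler file or another DSO.  The attribute keeps the symbol
   externally visible even under -fwhole-program.  */
__attribute__((externally_visible))
int plugin_entry(int x)
{
    return helper(x) + 1;
}

/* Variables can carry the attribute as well.  */
__attribute__((externally_visible))
int plugin_version = 1;
```

 Such a unit might be built together with a file that defines 'main',
for example with 'gcc -O2 -flto -fwhole-program whole.c main.c', where
'main.c' is likewise a hypothetical name.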

 The whole program mode assumptions are slightly more complex in C++,
where inline functions in headers are put into _COMDAT_ sections.
COMDAT functions and variables can be defined by multiple object files,
and their bodies are unified at link-time and dynamic link-time.  COMDAT
functions are changed to local only when their address is not taken and
thus un-sharing them with a library is not harmful.  COMDAT variables
always remain externally visible; however, for read-only variables it is
assumed that their initializers cannot be overwritten by a different
value.

 GCC provides the function and variable attribute 'visibility' that can
be used to specify the visibility of externally visible symbols (or
alternatively the '-fvisibility' command-line option).  ELF defines the
'default', 'protected', 'hidden' and 'internal' visibilities.
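
 As an illustration, the attribute can be attached to individual
declarations as sketched below.  The library and function names are
hypothetical.  Only the attribute 'visibility' and its argument strings
are taken from GCC.

```c
/* libshape.c: hypothetical shared-library source; all function names
   are illustrative.  */

/* Internal helper: with 'hidden' visibility the symbol cannot be
   referenced from outside the shared library, although other objects
   linked into the same library can still use it.  */
__attribute__((visibility("hidden")))
int shape_clamp(int v, int lo, int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/* Public entry point: 'default' keeps the usual ELF rules, so the
   symbol is exported from the library.  */
__attribute__((visibility("default")))
int shape_area(int w, int h)
{
    return shape_clamp(w, 0, 1000) * shape_clamp(h, 0, 1000);
}
```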

 The most commonly used visibility is 'hidden'.  It specifies that
the symbol cannot be referenced from outside of the current shared
library.  Unfortunately, this information cannot be used directly by the
link-time optimization in the compiler, since the whole shared library
might also contain non-LTO objects and those are not visible to the
compiler.

 GCC solves this problem using linker plugins.  A _linker plugin_ is an
interface to the linker that allows an external program to claim the
ownership of a given object file.  The linker then performs the linking
procedure by querying the plugin about the symbol table of the claimed
objects.  Once the linking decisions are complete, the plugin is
allowed to provide the final object file before the actual linking is
performed.  The linker plugin obtains the symbol resolution information,
which specifies which symbols provided by the claimed objects are bound
from the rest of the binary being linked.
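
 To make the role of the resolution information concrete, here is a
deliberately simplified sketch.  The enumeration and function names are
hypothetical and are not the identifiers used by the real linker plugin
interface, which defines its own, richer set of resolution codes.  The
sketch only illustrates the kind of decision the compiler can derive
from a per-symbol resolution.

```c
#include <stdbool.h>

/* Hypothetical, simplified resolution kinds; illustrative only.  */
enum resolution {
    RES_UNDEFINED,          /* no definition in the claimed objects */
    RES_PREVAILING_IR_ONLY, /* our definition is used and is referenced
                               only from LTO objects */
    RES_PREVAILING_SHARED,  /* our definition is used and is also bound
                               from non-LTO code */
    RES_PREEMPTED           /* a definition from elsewhere was chosen */
};

/* A symbol can be turned into a local one only when its definition
   prevails and nothing outside the LTO objects binds to it.  */
bool can_make_local(enum resolution r)
{
    return r == RES_PREVAILING_IR_ONLY;
}

/* True when the definition in the claimed objects is the one selected
   for this link.  */
bool definition_prevails(enum resolution r)
{
    return r == RES_PREVAILING_IR_ONLY || r == RES_PREVAILING_SHARED;
}
```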

 GCC is designed to be independent of the rest of the toolchain and aims
to support linkers without plugin support.  For this reason it does not
use the linker plugin by default.  Instead, the object files are
examined by 'collect2' before being passed to the linker, and objects
found to have LTO sections are passed to 'lto1' first.  This mode does
not work for library archives: the decision on which object files from
the archive are needed depends on the actual linking, and thus GCC would
have to implement the linker itself.  The resolution information is
missing too, and thus GCC needs to make an educated guess based on
'-fwhole-program'.  Without the linker plugin GCC also assumes by
default that symbols are declared 'hidden' and are not referred to by
non-LTO code.