ld: Canonical format

5.1.2 The BFD canonical object-file format
------------------------------------------

The greatest potential for loss of information occurs when there is the
least overlap between the information provided by the source format,
that stored by the canonical format, and that needed by the destination
format. A brief description of the canonical form may help you
understand which kinds of data you can count on preserving across
conversions.

_files_
Information stored on a per-file basis includes target machine
architecture, particular implementation format type, a demand
pageable bit, and a write protected bit. Information like Unix
magic numbers is not stored here--only the magic numbers' meaning,
so a 'ZMAGIC' file would have both the demand pageable bit and the
write protected text bit set. The byte order of the target is
stored on a per-file basis, so that big- and little-endian object
files may be used with one another.
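
As a rough illustration of storing a magic number's meaning rather than
the number itself, the following Python sketch maps classic 'a.out'
magic numbers onto the two per-file bits. The flag names 'D_PAGED' and
'WP_TEXT' echo BFD's own, but the 'CanonicalFile' record, the
'canonicalize_aout_header' helper, and the numeric flag values are
hypothetical. The treatment of 'OMAGIC' and 'NMAGIC' follows
conventional 'a.out' usage and is an assumption; only the 'ZMAGIC' case
is stated in the text above. This is a model, not BFD's implementation.

```python
from dataclasses import dataclass
from enum import Flag


class FileFlags(Flag):
    """Per-file bits; the numeric values are arbitrary in this sketch."""
    D_PAGED = 1   # file is demand pageable
    WP_TEXT = 2   # text segment is write protected


# Classic a.out magic numbers (octal).
OMAGIC, NMAGIC, ZMAGIC = 0o407, 0o410, 0o413

# Assumed mapping from each magic number to its meaning.
MAGIC_MEANING = {
    OMAGIC: FileFlags(0),
    NMAGIC: FileFlags.WP_TEXT,
    ZMAGIC: FileFlags.D_PAGED | FileFlags.WP_TEXT,
}


@dataclass
class CanonicalFile:
    """Hypothetical stand-in for the per-file canonical information."""
    architecture: str
    format_name: str
    byte_order: str   # "big" or "little"
    flags: FileFlags


def canonicalize_aout_header(magic: int, architecture: str,
                             byte_order: str) -> CanonicalFile:
    # The magic number is dropped here; only its meaning survives.
    return CanonicalFile(architecture, "a.out", byte_order,
                         MAGIC_MEANING[magic])
```

A back end writing some other format would then consult only 'flags' and
'byte_order', never the original magic number.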

_sections_
For each section in the input file, BFD records the name of the
section, the section's original address in the object file, size and
alignment information, various flags, and pointers into other BFD
data structures.

_symbols_
Each symbol contains a pointer to the information for the object
file which originally defined it, its name, its value, and various
flag bits. When a BFD back end reads in a symbol table, it
relocates all symbols to make them relative to the base of the
section where they were defined. Doing this ensures that each
symbol points to its containing section. Each symbol also has a
varying amount of hidden private data for the BFD back end. Since
the symbol points to the original file, the private data format for
that symbol is accessible. 'ld' can operate on a collection of
symbols of wildly different formats without problems.

Normal global and simple local symbols are maintained on output, so
an output file (whatever its format) will retain symbols pointing
to functions and to global, static, and common variables. Some
symbol information is not worth retaining; in 'a.out', type
information is stored in the symbol table as long symbol names.
This information would be useless to most COFF debuggers; the
linker has command-line switches to allow users to throw it away.

There is one word of type information within the symbol, so if the
format supports symbol type information within symbols (for
example, COFF, IEEE, Oasys) and the type is simple enough to fit
within one word (nearly everything but aggregates), the information
will be preserved.
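
The section-relative symbol values described above can be sketched in
Python as follows. The 'Section' and 'Symbol' records and the
'read_symbol' and 'output_address' helpers are hypothetical names
invented for this example, and the model omits flags, private data, and
type information; it only shows why a symbol stored as an offset from
its section base keeps pointing at the right place when the section
moves.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Section:
    name: str
    vma: int    # original address of the section in the object file
    size: int


@dataclass
class Symbol:
    name: str
    section: Section   # the containing section
    value: int         # offset from the base of that section


def read_symbol(name: str, address: int,
                sections: List[Section]) -> Symbol:
    """Turn an absolute symbol address into a section-relative symbol."""
    for section in sections:
        if section.vma <= address < section.vma + section.size:
            return Symbol(name, section, address - section.vma)
    raise ValueError(f"{name}: address {address:#x} is outside every section")


def output_address(symbol: Symbol, output_bases: Dict[str, int]) -> int:
    """Recompute the address once each section has a new base."""
    return output_bases[symbol.section.name] + symbol.value
```

For instance, a symbol read at address 0x1040 in a '.text' section that
starts at 0x1000 is stored with value 0x40, so placing '.text' at 0x8000
in the output puts the symbol at 0x8040.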

_relocation level_
Each canonical BFD relocation record contains a pointer to the
symbol to relocate to, the offset of the data to relocate, the
section the data is in, and a pointer to a relocation type
descriptor. Relocation is performed by passing messages through
the relocation type descriptor and the symbol pointer. Therefore,
relocations can be performed on output data using a relocation
method that is only available in one of the input formats. For
instance, Oasys provides a byte relocation format. A relocation
record requesting this relocation type would point indirectly to a
routine to perform this, so the relocation may be performed on a
byte being written to a 68k COFF file, even though 68k COFF has no
such relocation type.
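
The indirection through a relocation type descriptor can be modelled in
a few lines of Python. Everything here is a simplified assumption:
'RelocType', 'Reloc', 'add_byte', 'add_word32_be', and
'perform_relocations' are hypothetical names, the symbol pointer is
reduced to an already-resolved address, and the section field is left
out. The point is only that the routine travels with the relocation
record, so the output format never needs to know the relocation type.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RelocType:
    """Stand-in for a relocation type descriptor."""
    name: str
    apply: Callable[[bytearray, int, int], None]


@dataclass
class Reloc:
    symbol_address: int   # simplified: the symbol is already resolved
    offset: int           # offset of the data to relocate
    howto: RelocType      # descriptor carrying the routine to run


def add_byte(data: bytearray, offset: int, value: int) -> None:
    # Byte relocation: add the value to one byte, wrapping at 8 bits.
    data[offset] = (data[offset] + value) & 0xFF


def add_word32_be(data: bytearray, offset: int, value: int) -> None:
    # 32-bit big-endian relocation, wrapping at 32 bits.
    old = int.from_bytes(data[offset:offset + 4], "big")
    data[offset:offset + 4] = ((old + value) & 0xFFFFFFFF).to_bytes(4, "big")


BYTE_RELOC = RelocType("byte", add_byte)
WORD32_RELOC = RelocType("word32-be", add_word32_be)


def perform_relocations(data: bytearray, relocs: List[Reloc]) -> None:
    # The caller dispatches through the descriptor and knows nothing
    # about the individual relocation types.
    for reloc in relocs:
        reloc.howto.apply(data, reloc.offset, reloc.symbol_address)
```

Because 'perform_relocations' only calls through 'howto', a byte
relocation that originated in one input format can patch output bytes
destined for a format with no such relocation type.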

_line numbers_
Object formats can contain, for debugging purposes, some form of
mapping between symbols, source line numbers, and addresses in the
output file. These addresses have to be relocated along with the
symbol information. Each symbol with an associated list of line
number records points to the first record of the list. The head of
a line number list consists of a pointer to the symbol, which
allows finding out the address of the function whose line number is
being described. The rest of the list is made up of pairs: offsets
into the section and line numbers. Any format that can readily
derive this information (COFF, IEEE, and Oasys) can pass it
successfully to another such format.
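
A line number list of this shape, a head naming the symbol followed by
(section offset, line number) pairs, might be modelled as below. The
'LineTable' record and the 'line_for_address' helper are hypothetical,
and the lookup rule (take the last pair whose offset does not exceed the
address) is an assumption about how such a table would typically be
consulted, not something the text above specifies.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LineTable:
    function_name: str               # stands in for the head's symbol pointer
    entries: List[Tuple[int, int]]   # (section offset, line), sorted by offset


def line_for_address(table: LineTable, section_base: int,
                     address: int) -> Optional[int]:
    """Map an output address back to a source line, if one covers it."""
    offset = address - section_base
    offsets = [entry_offset for entry_offset, _ in table.entries]
    index = bisect_right(offsets, offset) - 1
    if index < 0:
        return None   # address lies before the first recorded offset
    return table.entries[index][1]
```

Since the pairs hold section offsets rather than absolute addresses,
relocating the section only changes the 'section_base' passed in; the
table itself is left untouched.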