gprof: File Format

1 
1 9.2 Profiling Data File Format
1 ==============================
1 
1 The old BSD-derived file format used for profile data does not contain a
1 magic cookie that allows checking whether a data file really is a
1 'gprof' file.  Furthermore, it does not provide a version number, thus
1 rendering changes to the file format almost impossible.  GNU 'gprof'
1 uses a new file format that provides these features.  For backward
1 compatibility, GNU 'gprof' continues to support the old BSD-derived
1 format, but not all features are supported with it.  For example,
1 basic-block execution counts cannot be accommodated by the old file
1 format.
1 
1    The new file format is defined in header file 'gmon_out.h'.  It
1 consists of a header containing the magic cookie and a version number,
1 as well as some spare bytes available for future extensions.  All data
1 in a profile data file is in the native format of the target for which
1 the profile was collected.  GNU 'gprof' adapts automatically to the
1 byte-order in use.
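
As an illustration of the header check described above, here is a minimal sketch in Python.  It assumes the layout commonly found in 'gmon_out.h' (a 4-byte cookie "gmon", a 4-byte version, and 12 spare bytes).  The function name 'parse_gmon_header' is hypothetical, and the byte-order guess based on the version field is a heuristic of this sketch, not necessarily how GNU 'gprof' detects it.

```python
import struct

GMON_MAGIC = b"gmon"       # cookie value assumed from gmon_out.h
HEADER_SIZE = 4 + 4 + 12   # cookie, version, spare bytes (assumed sizes)

def parse_gmon_header(data):
    """Return (version, byte_order) or raise ValueError for a bad header."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short for a gmon header")
    cookie, raw_version = data[:4], data[4:8]
    if cookie != GMON_MAGIC:
        raise ValueError("missing magic cookie; possibly old BSD format")
    # Heuristic (an assumption of this sketch): version numbers are small,
    # so pick the byte order that yields the smaller value.
    little = struct.unpack("<I", raw_version)[0]
    big = struct.unpack(">I", raw_version)[0]
    if little <= big:
        return little, "little"
    return big, "big"

# A synthetic big-endian header with version 1.
sample = b"gmon" + struct.pack(">I", 1) + bytes(12)
version, order = parse_gmon_header(sample)
```

A file lacking the cookie raises 'ValueError', which is the point at which a reader could fall back to the old BSD-derived format.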
1 
1    In the new file format, the header is followed by a sequence of
1 records.  Currently, there are three different record types: histogram
1 records, call-graph arc records, and basic-block execution count
1 records.  Each file can contain any number of each record type.  When
1 reading a file, GNU 'gprof' will ensure records of the same type are
1 compatible with each other and compute the union of all records.  For
1 example, for basic-block execution counts, the union is simply the sum
1 of all execution counts for each basic-block.
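
The union of basic-block execution count records can be sketched as follows.  The function 'merge_bb_counts' and the sample addresses are hypothetical, and each record is modelled simply as a list of (address, count) pairs.

```python
from collections import Counter

def merge_bb_counts(records):
    """Sum execution counts per basic-block address across all records."""
    total = Counter()
    for record in records:
        for address, count in record:
            total[address] += count
    return dict(total)

# Two made-up records, e.g. from two runs of the same program.
run_a = [(0x1000, 3), (0x1010, 1)]
run_b = [(0x1000, 2), (0x1020, 7)]
merged = merge_bb_counts([run_a, run_b])
```

Here the block at 0x1000 appears in both records, so its merged count is the sum of the two.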
1 
1 9.2.1 Histogram Records
1 -----------------------
1 
1 Histogram records consist of a header that is followed by an array of
1 bins.  The header contains the text-segment range that the histogram
1 spans, the size of the histogram in bytes (unlike in the old BSD format,
1 this does not include the size of the header), the rate of the profiling
1 clock, and the physical dimension that the bin counts represent after
1 being scaled by the profiling clock rate.  The physical dimension is
1 specified in two parts: a long name of up to 15 characters and a
1 single-character abbreviation.  For example, a histogram representing real
1 time would specify the long name as "seconds" and the abbreviation as "s".
1 This feature is useful for architectures that support performance
1 monitor hardware (which, fortunately, is becoming increasingly common).
1 For example, under DEC OSF/1, the "uprofile" command can be used to
1 produce a histogram of, say, instruction cache misses.  In this case,
1 the dimension in the histogram header could be set to "i-cache misses"
1 and the abbreviation could be set to "1" (because it is simply a count,
1 not a physical dimension).  Also, the profiling rate would have to be
1 set to 1 in this case.
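
The histogram header fields listed above can be decoded along these lines.  This is a sketch under assumptions: it takes a target with 32-bit addresses and little-endian byte order, and a field order of low address, high address, histogram size, profiling rate, 15-character dimension name, and one-character abbreviation.  The real field widths depend on the target, and 'parse_hist_header' is a hypothetical name.

```python
import struct

# low_pc, high_pc, hist_size, prof_rate, dimen[15], dimen_abbrev
# (32-bit little-endian target assumed for this sketch)
HIST_HDR = struct.Struct("<IIII15sc")

def parse_hist_header(data):
    """Decode one histogram record header into a dictionary."""
    low_pc, high_pc, hist_size, prof_rate, dimen, abbrev = HIST_HDR.unpack_from(data)
    return {
        "low_pc": low_pc,
        "high_pc": high_pc,
        "hist_size": hist_size,
        "prof_rate": prof_rate,
        # The long name is padded to 15 bytes; drop the padding.
        "dimension": dimen.rstrip(b"\0 ").decode("ascii"),
        "abbrev": abbrev.decode("ascii"),
    }

# A synthetic header for a real-time histogram sampled at 100 Hz.
sample = HIST_HDR.pack(0x1000, 0x13E8, 20, 100, b"seconds", b"s")
header = parse_hist_header(sample)
```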
1 
1    Histogram bins are 16-bit numbers and each bin represents an equal
1 amount of text-space.  For example, if the text-segment is one thousand
1 bytes long and if there are ten bins in the histogram, each bin
1 represents one hundred bytes.
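
The mapping between bins and text addresses can be sketched as below.  The helper names 'bin_range' and 'bin_for_address' are hypothetical, and the integer arithmetic is one reasonable choice for this illustration, not necessarily the one GNU 'gprof' uses.

```python
def bin_range(low_pc, high_pc, num_bins, index):
    """Return the [start, end) text addresses covered by bin `index`."""
    if not 0 <= index < num_bins:
        raise IndexError("bin index out of range")
    span = high_pc - low_pc
    start = low_pc + span * index // num_bins
    end = low_pc + span * (index + 1) // num_bins
    return start, end

def bin_for_address(low_pc, high_pc, num_bins, pc):
    """Return the index of the bin that covers address `pc`."""
    if not low_pc <= pc < high_pc:
        raise ValueError("address outside the histogram's text range")
    return (pc - low_pc) * num_bins // (high_pc - low_pc)

# The example from the text: 1000 bytes of text, ten bins.
first = bin_range(0, 1000, 10, 0)
last = bin_range(0, 1000, 10, 9)
hit = bin_for_address(0, 1000, 10, 250)
```

With these inputs each bin covers one hundred bytes, and address 250 falls in the third bin (index 2).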
1 
1 9.2.2 Call-Graph Records
1 ------------------------
1 
1 Call-graph records have a format that is identical to the one used in
1 the BSD-derived file format.  It consists of an arc in the call graph
1 and a count indicating the number of times the arc was traversed during
1 program execution.  Arcs are specified by a pair of addresses: the first
1 must be within the caller's function and the second must be within the
1 callee's function.  When performing profiling at the function level,
1 these addresses can point anywhere within the respective function.
1 However, when profiling at the line-level, it is better if the addresses
1 are as close to the call-site/entry-point as possible.  This will ensure
1 that the line-level call-graph is able to identify exactly which line of
1 source code performed calls to a function.
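
A call-graph arc and its traversal count can be decoded and accumulated as in the following sketch.  It assumes each arc is stored as a caller address, a callee address, and a count, all 32-bit and little-endian.  Address widths vary with the target, and 'accumulate_arcs' is a hypothetical name.

```python
import struct
from collections import Counter

# from_pc (caller), self_pc (callee), count; 32-bit little-endian assumed
ARC = struct.Struct("<III")

def accumulate_arcs(payloads):
    """Sum traversal counts for each (caller, callee) address pair."""
    totals = Counter()
    for payload in payloads:
        from_pc, self_pc, count = ARC.unpack(payload)
        totals[(from_pc, self_pc)] += count
    return dict(totals)

# Three made-up arcs; the first two describe the same caller/callee pair.
payloads = [
    ARC.pack(0x1004, 0x2000, 5),
    ARC.pack(0x1004, 0x2000, 2),
    ARC.pack(0x1040, 0x3000, 1),
]
arcs = accumulate_arcs(payloads)
```

Keying on the exact address pair keeps arcs from distinct call sites separate, which is what line-level profiling relies on.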
1 
1 9.2.3 Basic-Block Execution Count Records
1 -----------------------------------------
1 
1 Basic-block execution count records consist of a header followed by a
1 sequence of address/count pairs.  The header simply specifies the length
1 of the sequence.  In an address/count pair, the address identifies a
1 basic-block and the count specifies the number of times that basic-block
1 was executed.  Any address within the basic-block can be used.
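
Reading such a record can be sketched as follows.  The field widths are assumptions of this sketch: a 32-bit little-endian length followed by pairs of 32-bit address and 32-bit count.  On other targets the widths may differ.  The name 'parse_bb_record' is hypothetical.

```python
import struct

def parse_bb_record(data, offset=0):
    """Return (pairs, next_offset) for one basic-block count record."""
    # Header: the number of address/count pairs that follow.
    (num_pairs,) = struct.unpack_from("<I", data, offset)
    offset += 4
    pairs = []
    for _ in range(num_pairs):
        address, count = struct.unpack_from("<II", data, offset)
        pairs.append((address, count))
        offset += 8
    return pairs, offset

# A synthetic record holding two basic blocks.
record = struct.pack("<I", 2) + struct.pack("<II", 0x1000, 3) + struct.pack("<II", 0x1010, 1)
pairs, end = parse_bb_record(record)
```

The returned offset points just past the record, so a caller can continue with the next record in the file.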
1