gprof: Line-by-line

1 
1 5.3 Line-by-line Profiling
1 ==========================
1 
1 'gprof''s '-l' option causes the program to perform "line-by-line"
1 profiling.  In this mode, histogram samples are assigned not to
1 functions, but to individual lines of source code.  This only works with
1 programs compiled with older versions of the 'gcc' compiler.  Newer
1 versions of 'gcc' use a different program - 'gcov' - to display
1 line-by-line profiling information.
1 
1    With the older versions of 'gcc' the program usually has to be
1 compiled with a '-g' option, in addition to '-pg', in order to generate
1 debugging symbols for tracking source code lines.  Note, in much older
1 versions of 'gcc' the program had to be compiled with the '-a' command
1 line option as well.
1 
1    The flat profile is the most useful output table in line-by-line
1 mode.  The call graph isn't as useful as normal, since the current
1 version of 'gprof' does not propagate call graph arcs from source code
1 lines to the enclosing function.  The call graph does, however, show
1 each line of code that called each function, along with a count.
1 
1    Here is a section of 'gprof''s output, without line-by-line
1 profiling.  Note that 'ct_init' accounted for four histogram hits, and
1 13327 calls to 'init_block'.
1 
1      Flat profile:
1 
1      Each sample counts as 0.01 seconds.
1        %   cumulative   self              self     total
1       time   seconds   seconds    calls  us/call  us/call  name
1       30.77      0.13     0.04     6335     6.31     6.31  ct_init
1 
1 
1      		     Call graph (explanation follows)
1 
1 
1      granularity: each sample hit covers 4 byte(s) for 7.69% of 0.13 seconds
1 
1      index % time    self  children    called     name
1 
1                      0.00    0.00       1/13496       name_too_long
1                      0.00    0.00      40/13496       deflate
1                      0.00    0.00     128/13496       deflate_fast
1                      0.00    0.00   13327/13496       ct_init
1      [7]      0.0    0.00    0.00   13496         init_block
1 
1 
1    Now let's look at some of 'gprof''s output from the same program run,
1 this time with line-by-line profiling enabled.  Note that 'ct_init''s
1 four histogram hits are broken down into four lines of source code--one
1 hit occurred on each of lines 349, 351, 382 and 385.  In the call graph,
1 note how 'ct_init''s 13327 calls to 'init_block' are broken down into
1 one call from line 396, 3071 calls from line 384, 3730 calls from line
1 385, and 6525 calls from 387.
1 
1      Flat profile:
1 
1      Each sample counts as 0.01 seconds.
1        %   cumulative   self
1       time   seconds   seconds    calls  name
1        7.69      0.10     0.01           ct_init (trees.c:349)
1        7.69      0.11     0.01           ct_init (trees.c:351)
1        7.69      0.12     0.01           ct_init (trees.c:382)
1        7.69      0.13     0.01           ct_init (trees.c:385)
1 
1 
1      		     Call graph (explanation follows)
1 
1 
1      granularity: each sample hit covers 4 byte(s) for 7.69% of 0.13 seconds
1 
1        % time    self  children    called     name
1 
1                  0.00    0.00       1/13496       name_too_long (gzip.c:1440)
1                  0.00    0.00       1/13496       deflate (deflate.c:763)
1                  0.00    0.00       1/13496       ct_init (trees.c:396)
1                  0.00    0.00       2/13496       deflate (deflate.c:727)
1                  0.00    0.00       4/13496       deflate (deflate.c:686)
1                  0.00    0.00       5/13496       deflate (deflate.c:675)
1                  0.00    0.00      12/13496       deflate (deflate.c:679)
1                  0.00    0.00      16/13496       deflate (deflate.c:730)
1                  0.00    0.00     128/13496       deflate_fast (deflate.c:654)
1                  0.00    0.00    3071/13496       ct_init (trees.c:384)
1                  0.00    0.00    3730/13496       ct_init (trees.c:385)
1                  0.00    0.00    6525/13496       ct_init (trees.c:387)
1      [6]  0.0    0.00    0.00   13496         init_block (trees.c:408)
1 
1