• Adrian Hunter's avatar
    perf thread-stack: Represent jmps to the start of a different symbol · f08046cb
    Adrian Hunter authored
    The compiler might optimize a call/ret combination by making it a jmp.
    However the thread-stack does not presently cater for that, so that such
    control flow is not visible in the call graph. Make it visible by
    recording on the stack a branch to the start of a different symbol.
    Note, that means when a ret pops the stack, all jmps must be popped off
    first.
    
    Example:
    
      $ cat jmp-to-fn.c
      __attribute__((noinline)) int bar(void)
      {
              return -1;
      }
    
      __attribute__((noinline)) int foo(void)
      {
              return bar() + 1;
      }
    
      int main()
      {
              return foo();
      }
      $ gcc -ggdb3 -Wall -Wextra -O2 -o jmp-to-fn jmp-to-fn.c
      $ objdump -d jmp-to-fn
      <SNIP>
      0000000000001040 <main>:
          1040:       31 c0                   xor    %eax,%eax
          1042:       e9 09 01 00 00          jmpq   1150 <foo>
      <SNIP>
      0000000000001140 <bar>:
          1140:       b8 ff ff ff ff          mov    $0xffffffff,%eax
          1145:       c3                      retq
      <SNIP>
      0000000000001150 <foo>:
          1150:       31 c0                   xor    %eax,%eax
          1152:       e8 e9 ff ff ff          callq  1140 <bar>
          1157:       83 c0 01                add    $0x1,%eax
          115a:       c3                      retq
      <SNIP>
      $ perf record -o jmp-to-fn.perf.data -e intel_pt/cyc/u ./jmp-to-fn
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0,017 MB jmp-to-fn.perf.data ]
      $ perf script -i jmp-to-fn.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py jmp-to-fn.db branches calls
      2019-01-08 13:24:58.783069 Creating database...
      2019-01-08 13:24:58.794650 Writing records...
      2019-01-08 13:24:59.008050 Adding indexes
      2019-01-08 13:24:59.015802 Done
      $  ~/libexec/perf-core/scripts/python/exported-sql-viewer.py jmp-to-fn.db
    
    Before:
    
        main
            -> bar
    
    After:
    
        main
            -> foo
                -> bar
    
    Committer testing:
    
    Install the python2-pyside package, then select these menu options
    on the GUI:
    
       "Reports"
          "Context sensitive callgraphs"
    
    Then go on expanding the symbols, to get, full picture when doing this
    on a fedora:29 with gcc version 8.2.1 20181215 (Red Hat 8.2.1-6) (GCC):
    
    jmp-to-fn
      PID:TID
        _start                (ld-2.28.so)
          __libc_start_main
            main
              foo
                bar
    
    To verify that indeed, this fixes the problem.
    Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
    Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
    Link: http://lkml.kernel.org/r/20190109091835.5570-5-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    f08046cb
thread-stack.c 18.4 KB