Hi Jiri,

On 2015/11/17 23:05, Jiri Olsa wrote:
hi,
as reported by Milian, currently for DWARF unwind (both libdw
and libunwind) we display callchain in callee order only.

Adding the support to follow callchain order setup to libunwind
DWARF unwinder, so we could get following output for report:

   $ perf record --call-graph dwarf ls
   ...
   $ perf report --no-children --stdio

     39.26%  ls       libc-2.21.so      [.] __strcoll_l
                  |
                  ---__strcoll_l
                     mpsort_with_tmp
                     mpsort_with_tmp
                     sort_files
                     main
                     __libc_start_main
                     _start
                     0

   $ perf report -g caller --no-children --stdio
     ...
     39.26%  ls       libc-2.21.so      [.] __strcoll_l
                  |
                  ---0
                     _start
                     __libc_start_main
                     main
                     sort_files
                     mpsort_with_tmp
                     mpsort_with_tmp
                     __strcoll_l

Tested on x86_64. The change is in generic code only,
so it should not affect other archs. Still it would be
nice to have some confirmation.. Wang Nan? ;-)

It'd be nice to have this for libdw unwind as well,
but it looks like it's out of reach for perf code.. Jan?

Also available in:
   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
   perf/callchain_1


Thanks for notifying me about this. I have tested it in my environment.

It works well for me except for a small behavior change. Please see below.

Before applying this patch set:

# perf report --no-children --stdio --call-graph=callee
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
              |
              ---__vdso_gettimeofday
                 funcc
                 funcb
                 funca
                 main
                 __libc_start_main
                 _start

     3.38%  a.out    a.out             [.] funcc
              |
              ---funcc
                 |
                  --2.70%-- funcb
                            funca
                            main
                            __libc_start_main
                            _start

     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
              |
              ---sched_clock
                 perf_event_nmi_handler
                 nmi_handle
     ...

And caller:

# ./perf report --no-children --stdio --call-graph=caller
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
              |
              ---__vdso_gettimeofday
                 funcc
                 funcb
                 funca
                 main
                 __libc_start_main
                 _start

     3.38%  a.out    a.out             [.] funcc
              |
              ---funcc
                 |
                  --2.70%-- funcb
                            funca
                            main
                            __libc_start_main
                            _start

     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
              |
              ---return_from_execve
                 sys_execve
                 do_execveat_common.isra.27


The user-code parts of the two outputs are identical, so I can confirm the bug.
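
To make the expectation concrete, here is a minimal sketch (not perf code) of what the two orders should look like for the same sample. It assumes the unwinder hands back frames innermost first; the helper name order_callchain is hypothetical and only for illustration.

```python
# Illustrative sketch only, not taken from perf.
# Assumption: the unwinder yields frames innermost (leaf) first.

def order_callchain(frames, order):
    """Return the frames arranged for the requested display order."""
    if order == "callee":
        # Leaf first, outermost caller last.
        return list(frames)
    if order == "caller":
        # Outermost caller first, leaf last.
        return list(reversed(frames))
    raise ValueError("unknown callchain order: %r" % (order,))


# User-space frames of the a.out sample shown above.
unwound = [
    "__vdso_gettimeofday",
    "funcc",
    "funcb",
    "funca",
    "main",
    "__libc_start_main",
    "_start",
]

callee_view = order_callchain(unwound, "callee")
caller_view = order_callchain(unwound, "caller")
```

With a correct implementation the caller view is the callee view reversed, which is not what the two reports above show for the user-space part.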

After applying this patchset:

# ./perf report --no-children --stdio --call-graph=callee
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
              |
              ---__vdso_gettimeofday
                 funcc
                 funcb
                 funca
                 main
                 __libc_start_main
                 _start

     3.38%  a.out    a.out             [.] funcc
              |
              ---funcc
                 |
                 |--2.70%-- funcb
                 |          funca
                 |          main
                 |          __libc_start_main
                 |          _start
                 |
                  --0.68%-- 0
     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
              |
              ---sched_clock
                 perf_event_nmi_handler
     ...

And caller:

# ./perf report --no-children --stdio --call-graph=caller
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
              |
              ---_start
                 __libc_start_main
                 main
                 funca
                 funcb
                 funcc
                 __vdso_gettimeofday

     3.38%  a.out    a.out             [.] funcc
              |
              |--2.70%-- _start
              |          __libc_start_main
              |          main
              |          funca
              |          funcb
              |          funcc
              |
               --0.68%-- 0
                         funcc

     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
              |
              ---return_from_execve
                 sys_execve
    ...

It fixes the bug. However, do you see the extra "0.68%-- 0" in the tree?

I left a comment on patch 2/3, please have a look. I think this change
would be okay for me if we treat the old behavior as a bug (for example:
the sum of all branches not being equal to the overhead of the entry
itself). However, the original code explicitly avoids generating the '0'
entry, so I think we should make it clear.
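
The arithmetic behind that remark can be checked with a short sketch. It only uses the percentages from the funcc entry in the reports above; the helper name unaccounted is hypothetical.

```python
# Illustrative sketch only: checks whether the branch percentages of one
# entry add up to that entry's own overhead.
from decimal import Decimal


def unaccounted(overhead, branches):
    """Return the part of the overhead not covered by any branch."""
    return Decimal(overhead) - sum(Decimal(b) for b in branches)


# funcc entry before the patch set: a single 2.70% branch under 3.38%.
gap_before = unaccounted("3.38", ["2.70"])

# funcc entry after the patch set: the extra "0" branch carries 0.68%.
gap_after = unaccounted("3.38", ["2.70", "0.68"])
```

Before the patch set the gap is 0.68, which is exactly the weight of the new "0" branch, and after it the gap is zero.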

Thank you.


thanks,
jirka


Cc: Jan Kratochvil <jkrat...@redhat.com>
---
Jiri Olsa (3):
       perf tools: Move initial entry call into get_entries function
       perf tools: Add callchain order support for libunwind DWARF unwinder
       perf test: Add callchain order setup for DWARF unwinder test

  tools/perf/tests/dwarf-unwind.c    | 22 +++++++++++++++++++---
  tools/perf/util/unwind-libunwind.c | 60 +++++++++++++++++++++++++++++++++++++++---------------------
  2 files changed, 58 insertions(+), 24 deletions(-)


