On Mon, Feb 01, 2016 at 10:53:29AM +0200, Adrian Hunter wrote:
> On 01/02/16 05:21, Wang Nan wrote:
> > Following segfault can happen with a non-root user:
> >
> >  $ ./perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
> >  WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
> >  check /proc/sys/kernel/kptr_restrict.
> >
> >  Samples in kernel functions may not be resolved if a suitable vmlinux
> >  file is not found in the buildid cache or in the vmlinux path.
> >
> >  Samples in kernel modules won't be resolved at all.
> >
> >  If some relocation was applied (e.g. kexec) symbols may be misresolved
> >  even with a suitable vmlinux or kallsyms file.
> >
> >  Segmentation fault (core dumped)
> >
> > The error is in tracepoint_error: it assumes 'e' is valid.
> >
> > However, there are many situation a parse_event can be called without
> > parse_events_error. See result of
> > 'grep 'parse_events(.*NULL)' ./tools/perf/ -r'.
> >
> > This patch makes tracepoint_error() directly return when !e.
>
> I sent the same fix here:
>
> http://marc.info/?l=linux-kernel&m=145381056111871

Yeah, I couldn't reproduce it, but we narrowed that down to: a machine with
Intel PT, without perf_event_attr.context_switch, and a non-root user, i.e. a
user that can't access the debugfs events info. I rebooted my new machine with:

  [root@jouet ~]# uname -r
  4.2.3-300.fc23.x86_64

And, as root, all works, because root can read the debugfs events info to get
the "sched:sched_switch" info:

  [root@jouet ~]# perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  0 a anaconda-ks.cfg bin GBPCEFwr64.tar-from-deb perf.data perf.data.old perf-f23-bringup.todo
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.214 MB perf.data ]
  [root@jouet ~]# perf evlist
  intel_pt/tsc=1,noretcomp=1/u
  sched:sched_switch
  dummy:u
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint
  # events
  [root@jouet ~]#

But not as a non-privileged user:

  (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Starting program: record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  No executable file specified.
  Use the "file" or "exec-file" command.
  (gdb) file perf
  Reading symbols from perf...done.
  (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib64/libthread_db.so.1".

  Program received signal SIGSEGV, Segmentation fault.
  0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  410         e->str = strdup("can't access trace events");
  Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.4.5-1.fc23.x86_64 bzip2-libs-1.0.6-18.fc23.x86_64 elfutils-libelf-0.165-2.fc23.x86_64 elfutils-libs-0.165-2.fc23.x86_64 libunwind-1.1-10.fc23.x86_64 nss-softokn-freebl-3.21.0-1.1.fc23.x86_64 numactl-libs-2.0.10-3.fc23.x86_64 perl-libs-5.22.1-350.fc23.x86_64 python-libs-2.7.10-8.fc23.x86_64 slang-2.3.0-4.fc23.x86_64 xz-libs-5.2.1-3.fc23.x86_64 zlib-1.2.8-9.fc23.x86_64
  (gdb) bt
  #0  0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  #1  0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:433
  #2  0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:498
  #3  0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:936
  #4  0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391
  #5  0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361
  #6  0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401
  #7  0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253
  #8  0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364
  #9  0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664
  #10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539
  #11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264
  #12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390
  #13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451
  #14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495
  #15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618
  (gdb)

I am applying Adrian's original patch, adding the above explanations and parts
of Wang's, namely the part about grep showing that this parameter can be NULL
while the function doesn't check it.

- Arnaldo

> > Signed-off-by: Wang Nan <wangn...@huawei.com>
> > Cc: Adrian Hunter <adrian.hun...@intel.com>
> > Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
> > Cc: Tong Zhang <zt...@vt.edu>
> > Cc: Josh Poimboeuf <jpoim...@redhat.com>
> > ---
> >  tools/perf/util/parse-events.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > index 4f7b0ef..813d9b2 100644
> > --- a/tools/perf/util/parse-events.c
> > +++ b/tools/perf/util/parse-events.c
> > @@ -399,6 +399,9 @@ static void tracepoint_error(struct parse_events_error *e, int err,
> >  {
> >  	char help[BUFSIZ];
> >
> > +	if (!e)
> > +		return;
> > +
> >  	/*
> >  	 * We get error directly from syscall errno ( > 0),
> >  	 * or from encoded pointer's error ( < 0).
> >