Hi Jiri and David,

On Thu, 26 Sep 2013 19:51:05 +0200, Jiri Olsa wrote:
> On Sun, Sep 22, 2013 at 08:05:59PM -0600, David Ahern wrote:
>> When recording raw_syscalls for the entire system, e.g.,
>>     perf record -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
>>
>> you end up with a negative feedback loop as perf itself calls
>> write() fairly often. This patch mmap's the file in chunks of 64M
>> at a time and copies events from the event buffers to the file
>> avoiding write system calls.
>
> moved processing into userspace:
>
>     17.24%   -17.10%  libpthread-2.15.so  [.] __write_nocancel
>     ...
>      0.07%    +0.64%  libc-2.15.so        [.] __memcpy_sse2
>      0.02%   +51.84%  libc-2.15.so        [.] __memcpy_ssse3_back
>      0.01%    +0.34%  libc-2.15.so        [.] __mempcpy_sse2
>     ...
>>
>> Before (with write syscall):
>>
>> perf record -o /tmp/perf.data -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
>> [ perf record: Woken up 0 times to write data ]
>> [ perf record: Captured and wrote 81.843 MB /tmp/perf.data (~3575786 samples) ]
>>
>> After (using mmap):
>>
>> perf record -o /tmp/perf.data -e raw_syscalls:*,sched:sched_switch -a -- sleep 1
>> [ perf record: Woken up 31 times to write data ]
>
>                          ^^^^^^^^
> but it's still faster, since we finally get perf a chance to sleep ;-)
>
> new time:
>   real 0m30.392s
>   user 0m0.041s
>   sys  0m0.389s
>
> old time:
>   real 0m32.235s
>   user 0m3.080s
>   sys  0m14.444s

But why is the new user time so short?  I guess it should take at least
10 seconds or so.  Any ideas?

>
>> [ perf record: Captured and wrote 8.203 MB /tmp/perf.data (~358388 samples) ]
>>
>> Before I get too far down this path I wanted to get comments on the approach.
>
> I think it's worthwhile doing this

Indeed!  It looks like a nice improvement.

Thanks,
Namhyung
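
For anyone following the thread without the patch at hand, the approach David describes (map the output file one chunk at a time and copy event data into the mapping instead of calling write()) can be sketched roughly as below. This is only an illustration in Python, not the perf code, which is C. The `MmapWriter` class, its method names, and the final truncation to the bytes actually written are assumptions made for the example; only the 64M chunk size comes from the patch description.

```python
import mmap
import os

# 64M per mapping, as in the patch description.
DEFAULT_CHUNK = 64 * 1024 * 1024


class MmapWriter:
    """Hypothetical append-only writer that maps the file chunk by chunk."""

    def __init__(self, path, chunk_size=DEFAULT_CHUNK):
        # mmap offsets must be multiples of the allocation granularity.
        if chunk_size % mmap.ALLOCATIONGRANULARITY != 0:
            raise ValueError("chunk_size must be a multiple of the page size")
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
        self.chunk_size = chunk_size
        self.chunk_start = 0  # file offset where the current mapping begins
        self.pos = 0          # next free byte inside the current mapping
        self.map = None
        self._map_chunk(0)

    def _map_chunk(self, offset):
        """Drop the current mapping and map the next chunk of the file."""
        if self.map is not None:
            self.map.close()
        # Extend the file first so the whole mapped range is backed by it.
        os.ftruncate(self.fd, offset + self.chunk_size)
        self.map = mmap.mmap(self.fd, self.chunk_size, offset=offset)
        self.chunk_start = offset
        self.pos = 0

    def write(self, data):
        """Copy data into the mapping; no write() syscall per call."""
        off = 0
        while off < len(data):
            if self.pos == self.chunk_size:
                self._map_chunk(self.chunk_start + self.chunk_size)
            n = min(self.chunk_size - self.pos, len(data) - off)
            self.map[self.pos:self.pos + n] = data[off:off + n]
            self.pos += n
            off += n

    def close(self):
        """Unmap, trim the file to the bytes actually written, and close."""
        total = self.chunk_start + self.pos
        self.map.close()
        os.ftruncate(self.fd, total)
        os.close(self.fd)
        return total
```

A caller would create one `MmapWriter("/tmp/perf.data")`, call `write()` for each batch of events drained from the ring buffers, and call `close()` at the end. The only system calls left on the data path are the ftruncate/mmap/munmap triple once per chunk, which fits the shift from `__write_nocancel` to memcpy seen in Jiri's diff.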