Hi,

Sorry for the late reply.

On Mon, Sep 24, 2018 at 09:32:11PM +0300, Alexey Budankov wrote:
> On 24.09.2018 17:29, Jiri Olsa wrote:
> > On Mon, Sep 24, 2018 at 04:09:09PM +0300, Alexey Budankov wrote:
> >> Command:
> >>
> >> /usr/bin/time ./perf.thr record --threads=T \
> >>    -N -B -T -R --call-graph dwarf,1024 --user-regs=ip,bp,sp \
> >>    -e cpu/period=P,event=0x3c/Duk,\
> >>       cpu/period=P,umask=0x3/Duk,\
> >>       cpu/period=P,event=0xc0/Duk,\
> >>       cpu/period=0xaae61,event=0xc2,umask=0x10/uk,\
> >>       cpu/period=0x11171,event=0xc2,umask=0x20/uk,\
> >>       cpu/period=0x11171,event=0xc2,umask=0x40/uk \
> >>    --clockid=monotonic_raw -- ./matrix.gcc
> >>
> >> Workload: matrix multiplication in 128 threads
> >>
> >> T : 272
> >>    P (period, ms)       : 0.35 
> >>    runtime overhead (%) : 13x ~ 87.73 / 6.81
> >>    data loss (%)        : 0
> >>    LOST events          : 36
> >>    SAMPLE events        : 8048542
> >>    perf.data size (GiB) : 10
> > 
> > any idea why does it have some much more samples?
> 
> Presumably, this is because period is 350us and this is the smallest 
> one that perf.thr manages to capture data without data loss (=0) when T=272.
> However, during collection, I get message that max sampling frequency 
> is lowered to 3KHz.

And it also took much longer than with AIO: 87.73 vs. 22.34 (N=272).
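
To put the numbers side by side, here is a quick sketch of the ratios. It
uses only the figures quoted in this thread and assumes that 6.81 is the
runtime of the workload without perf (that is how I read "13x ~ 87.73 / 6.81")
and that 22.34 is the AIO runtime for the same workload. The variable names
are mine, just for illustration.

```python
# Runtimes as quoted in this thread (same unit as reported by /usr/bin/time).
baseline = 6.81   # assumed: workload alone, without perf
threads = 87.73   # perf.thr record --threads=272
aio = 22.34       # assumed: AIO run of the same workload


def slowdown(measured: float, reference: float) -> float:
    """Return how many times slower `measured` is than `reference`."""
    return measured / reference


threads_vs_baseline = slowdown(threads, baseline)
aio_vs_baseline = slowdown(aio, baseline)
threads_vs_aio = slowdown(threads, aio)

print(f"threads vs baseline: {threads_vs_baseline:.2f}x")
print(f"AIO vs baseline:     {aio_vs_baseline:.2f}x")
print(f"threads vs AIO:      {threads_vs_aio:.2f}x")
```

So under those assumptions the threaded run is roughly 4 times slower than
the AIO one, on top of the 13x already reported against the baseline.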

Thanks,
Namhyung
