* Andy Lutomirski <l...@amacapital.net> wrote:

> On Mon, Dec 7, 2015 at 1:51 PM, Andy Lutomirski <l...@kernel.org> wrote:
>
> > This is kind of like the 32-bit and compat code, except that I preserved
> > the fast path this time.  I was unable to measure any significant
> > performance change on my laptop in the fast path.
> >
> > What do you all think?
>
> For completeness, if I zap the fast path entirely (see attached), I lose 20
> cycles (148 cycles vs 128 cycles) on Skylake.  Switching between movq and
> pushq for stack setup makes no difference whatsoever, interestingly.  I
> haven't tried to figure out exactly where those 20 cycles go.

So I asked for this before, and I'll do so again: could you please stick the
cycle-granular system call performance test into a 'perf bench' variant so
that:

 1) More people can run it on various pieces of hardware and help quantify
    the patches.

 2) We can keep an eye on not regressing base system call performance in the
    future, with a good in-tree testcase.

Thanks!!

	Ingo
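
To illustrate the kind of test being asked for, here is a minimal user-space
sketch. It is not the test referred to in this thread and not 'perf bench'
code; the names read_counter() and bench_getpid() are hypothetical. It assumes
a raw getpid() system call is a reasonable stand-in for "base system call
cost". On x86 it reads the TSC, whose ticks only approximate core cycles; on
other architectures it falls back to nanoseconds from clock_gettime().

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#define HAVE_TSC 1
#endif

/* Read a monotonically increasing counter: TSC ticks on x86, ns elsewhere. */
static inline uint64_t read_counter(void)
{
#ifdef HAVE_TSC
	return __rdtsc();
#else
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/*
 * Average counter ticks per raw getpid() system call.
 * syscall(SYS_getpid) is used so that no libc-level caching can hide
 * the kernel entry. Loop overhead is included in the result.
 */
static double bench_getpid(unsigned long iterations)
{
	uint64_t start, end;
	unsigned long i;

	if (iterations == 0)
		return 0.0;

	/* Warm up caches and branch predictors before timing. */
	for (i = 0; i < 1000; i++)
		syscall(SYS_getpid);

	start = read_counter();
	for (i = 0; i < iterations; i++)
		syscall(SYS_getpid);
	end = read_counter();

	return (double)(end - start) / (double)iterations;
}
```

Calling bench_getpid() with a large iteration count from a small main() and
printing the result gives a per-call figure that can be compared before and
after a patch on the same machine. Pinning the task to one CPU and repeating
the run several times would likely be needed before trusting differences as
small as 20 cycles.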