Hi,

On Wed, 2017-06-07 at 10:10 -0400, David Miller wrote:
> From: Paolo Abeni <pab...@redhat.com>
> Date: Wed, 07 Jun 2017 09:56:45 +0200
>
> > Hi David,
> >
> > On Tue, 2017-06-06 at 16:23 +0200, Paolo Abeni wrote:
> >> when udp_recvmsg() is executed, on x86_64 and other archs, most skb
> >> fields are on cold cachelines.
> >> If the skbs are linear and the kernel doesn't need to compute the
> >> UDP csum, only a handful of skb fields are required by udp_recvmsg().
> >> Since we already use skb->dev_scratch to cache hot data, and
> >> there are 32 bits unused on 64 bit archs, use that field to cache
> >> as much data as we can, and try to prefetch on dequeue the relevant
> >> fields that are left out.
> >>
> >> This can save up to 2 cache misses per packet.
> >>
> >> v1 -> v2:
> >>  - changed udp_dev_scratch field types to the u{32,16} variants,
> >>    replaced the bitfield with a bool
> >>
> >> Signed-off-by: Paolo Abeni <pab...@redhat.com>
> >
> > Can you please keep this series on hold for a little while? The
> > lkp-robot just reported a performance regression on v1 which I still
> > have to investigate. I can't look at it really soon, but I expect
> > the same should apply to v2.
> >
> > It sounds quite weird to me, since the bisected patch touches the
> > UDP code only and the regression is on apachebench.
>
> Hmmm, DNS lookups?

Thanks for looking into this.
I spent a little time trying to reproduce the regression. There are no
DNS requests during the test, because it is run against the loopback
address (verified with a perf probe on the UDP code).

I collected several samples for both the patched and the vanilla
kernel, and I measured a lot of variance - well above 21% - even when
re-running the same kernel, and a similar distribution of results when
comparing the vanilla and patched kernels.

I notified the lkp ML of the above, and I think this is actually a
test-suite artifact. I'll re-submit v3 unchanged, if there are no
objections.

Cheers,

Paolo