On Fri, Jan 29, 2016 at 6:15 PM, Eric Dumazet <eric.duma...@gmail.com> wrote: > On Fri, 2016-01-29 at 16:47 -0800, Tom Herbert wrote: > >> Hmm, thanks for pointing that out. It looks like this is called by a >> couple drivers as part of pulling up the data to get alignment. I am >> actually surprised they are doing this. Flow dissector is not the >> cheapest function in the world and it seems like a shame to be calling >> it this early and not even getting a hash for the skb. Also, this is >> another instance of touching the data so if we do get to the point of >> trying steering packets without cache miss (another thread); this >> undermines that. Seems like it might be just as well to pull up a >> fixed number of bytes (ixgbe want min of 60 bytes), or don't pull up >> anything avoid the cache miss here and let the stack pull up as >> needed. > > Except that when for example mlx4 is able to pull only the headers > instead of a fixed amount of bytes, we increased the performance, > because GRO could cook packets twice as big for tunneled traffic. > > ( commit cfecec56ae7c7c40f23fbdac04acee027ca3bd66 ) > > 1) Increasing performance by prefetching headers is a separate matter, > completely orthogonal to this. > > 2) Taking the opportunity to also give back the l4 hash as a bonus > has been considered, but tests showed no clear benefit. > > For typical cases where l4 hash is not used (RPS and RFS being not > enabled by default on linux), computing it brings nothing but added > complexity. > Neither is GRE enabled by default in Linux and it is not a typical case. So that patch is an optimization for a very narrow use case that impacts the core data path for everyone. Please at least consider making it configurable.
> >