On Mon, 6 May 2019 16:51:57 -0700
Jakub Kicinski <jakub.kicin...@netronome.com> wrote:

> On Sun,  5 May 2019 13:36:06 +0300, Tariq Toukan wrote:
> > Many device drivers use the same prefetch code structure to
> > deal with small L1 cacheline size.
> > Take this code into a function and call it from the drivers.
> > 
> > Suggested-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> > Signed-off-by: Tariq Toukan <tar...@mellanox.com>
> > Reviewed-by: Saeed Mahameed <sae...@mellanox.com>
> > Cc: Jesper Dangaard Brouer <bro...@redhat.com>  
> 
> We could bike shed on the name a little - net_prefetch_headers() ?
> but at least a short kdoc explanation for the purpose of this helper
> would be good IMHO.

I would at least improve the commit message.  As Alexander so nicely
explained[1], the purpose of this prefetch is: "the 2 prefetches are needed
for x86 if you want a full TCP or IPv6 header pulled into the L1 cache for
instance."  Although, this is not true for a minimum-size TCP packet:
Eth(14)+IP(20)+TCP(20)=54 bytes.  Am I missing an alignment in my calc?
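
To make the arithmetic concrete, here is a small userspace sketch (not
kernel code) that counts how many cache lines a header stack touches.
The enum names and cachelines_spanned() are made up for illustration; the
64-byte line size and the start offset are assumptions passed in by the
caller.

```c
#include <assert.h>
#include <stddef.h>

/* Header sizes in bytes (protocol constants; names are illustrative only). */
enum {
	ETH_HDR_LEN      = 14,
	IPV4_HDR_MIN_LEN = 20,
	IPV6_HDR_LEN     = 40,
	TCP_HDR_MIN_LEN  = 20,
	TCP_HDR_MAX_LEN  = 60,
};

/*
 * Number of cache lines touched by 'len' bytes that start 'offset' bytes
 * past a cache-line boundary. Hypothetical helper, not a kernel API.
 */
static size_t cachelines_spanned(size_t offset, size_t len, size_t line_size)
{
	assert(line_size > 0);
	if (len == 0)
		return 0;
	return (offset % line_size + len + line_size - 1) / line_size;
}
```

With 64-byte lines and the packet starting on a line boundary, the
minimum Eth+IPv4+TCP stack (54 bytes) stays within one line, while
Eth+IPv6+TCP (74 bytes) or IPv4 with full TCP options (94 bytes) needs
two.  A sufficiently large start offset pushes even the 54-byte case
into a second line.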

[1] 
https://lore.kernel.org/netdev/CAKgT0UeEL3W42eDqSt97xnn3tXDtWMf4sdPByAtvbx=z7sx...@mail.gmail.com/

The name net_prefetch_headers() suggested by Jakub makes sense, as it
indicates that the helper should be used for prefetching packet headers.

As Alexander also explained, I was wrong in thinking the HW DCU (Data
Cache Unit) prefetcher will fetch two cache-lines automatically.  The
DCU prefetcher is a streaming prefetcher and doesn't see our access
pattern, which is why we need this.
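
For reference, the pattern under discussion looks roughly like the
sketch below.  It is a userspace approximation, not the patch itself:
it uses the name Jakub suggested, substitutes the GCC/Clang builtin
__builtin_prefetch() for the kernel's prefetch(), assumes 64-byte L1
lines, and takes the "< 128" condition from the way existing drivers
open-code this.

```c
/* Assumption: 64-byte L1 cache lines; the kernel defines this per arch. */
#ifndef L1_CACHE_BYTES
#define L1_CACHE_BYTES 64
#endif

/*
 * Hypothetical userspace stand-in for the proposed helper: explicitly
 * prefetch the first cache line of packet data and, when lines are
 * smaller than 128 bytes, the following line as well, since the hardware
 * streaming prefetcher does not see this access pattern.
 */
static inline void net_prefetch_headers(const void *data)
{
	__builtin_prefetch(data);
#if L1_CACHE_BYTES < 128
	/* Larger header stacks (e.g. Eth+IPv6+TCP) spill past one 64-byte line. */
	__builtin_prefetch((const unsigned char *)data + L1_CACHE_BYTES);
#endif
}
```

A driver would call this once on the start of the packet data, before
parsing the headers.  Prefetches are only hints, so the call does not
change program behavior, only cache state.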

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
