eal: add temporal store memcpy support for AMD platform

Ananyev, Konstantin Wed, 27 Oct 2021 05:23:26 -0700

 
> 
> Hi Mattias,
> 
> > > 6) What is the use-case for this? When would a user *want* to use this
> > > instead of rte_memcpy()?
> > > If the data being loaded is relevant to datapath/packets, presumably other
> > > packets might require the loaded data, so temporal (normal) loads should
> > > be used to cache the source data?
> >
> >
> > I'm not sure if your first question is rhetorical or not, but a memcpy()
> > in a NT variant is certainly useful. One use case for a memcpy() with
> > temporal loads and non-temporal stores is if you need to archive packet
> > payload for (distant, potential) future use, and want to avoid causing
> > unnecessary LLC evictions while doing so.
> 
> Yes, I agree that there are certainly benefits in using cache-locality hints.
> There is an open question about whether the src, the dst, or both should be
> non-temporal.
> 
> In the implementation of this patch, the NT/T type of store is reversed from 
> your use-case:
> 1) Loads are NT (so loaded data is not cached for future packets)
> 2) Stores are T (so copied/dst data is now resident in L1/L2)
> 
> In theory there might even be valid uses for this type of memcpy where loaded
> data is not needed again soon and stored data is referenced again soon,
> although I cannot think of any here while typing this mail..
> 
> I think some use-case examples, and clear documentation on when/how to choose
> between rte_memcpy() and any (potential future) rte_memcpy_nt() variants, are
> required to progress this patch.
> 
> Assuming a strong use-case exists, and it can be clearly indicated to users
> of DPDK APIs which rte_memcpy() to use, we can look at technical details
> around enabling the implementation.
>


+1 here.
Function behaviour and restrictions (src parameter needs to be 16/32 B aligned,
etc.), along with expected usage scenarios, have to be documented properly.
Again, as Harry pointed out, I don't see any AMD specific instructions in this
function, so presumably such a function can go into the __AVX2__ code block and
no new defines will be required.
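
For illustration only, a minimal sketch of the kind of copy discussed above
(NT loads, temporal stores) written with plain AVX2 intrinsics and nothing
AMD specific. This is not the code from the patch: the name
copy_nt_load_t_store() is hypothetical, and the sketch assumes src is 32 B
aligned and the length is a multiple of 32. The target("avx2") attribute is
a GCC/clang convenience so the snippet builds without -mavx2; inside an
existing __AVX2__ block it would presumably not be needed.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the function proposed in the patch.
 *
 * Copies n bytes from src to dst using non-temporal loads and
 * regular (temporal) stores.
 *
 * Restrictions (assumed for this sketch):
 *  - src must be 32-byte aligned (required by the NT load instruction)
 *  - n must be a multiple of 32
 *  - dst has no alignment requirement (unaligned store is used)
 */
__attribute__((target("avx2")))
static void
copy_nt_load_t_store(void *dst, const void *src, size_t n)
{
	const __m256i *s = (const __m256i *)src;
	uint8_t *d = (uint8_t *)dst;
	size_t i;

	assert(((uintptr_t)src & 31) == 0);
	assert((n & 31) == 0);

	for (i = 0; i < n; i += 32) {
		/* NT load: hint that the source line need not stay cached. */
		__m256i v = _mm256_stream_load_si256(s++);
		/* Regular store: destination data goes through the cache. */
		_mm256_storeu_si256((__m256i *)(d + i), v);
	}
}
```

A caller would have to guarantee the alignment and length restrictions (or
handle head/tail bytes separately) and check for AVX2 support at run time,
which is exactly the kind of thing the documentation should spell out.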

Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy support for AMD platform
