On Fri, 29 Jul 2022 12:13:52 +0000
Konstantin Ananyev <konstantin.anan...@huawei.com> wrote:

> Sorry, missed that part.
> 
> >   
> > > Another question - who will do 'sfence' after the copying?
> > > Would it be inside memcpy_nt (seems quite costly), or would
> > > it be another API function for that: memcpy_nt_flush() or so?  
> > 
> > Outside. Only the developer knows when it is required, so it wouldn't make 
> > any sense to add the cost inside memcpy_nt().
> > 
> > I don't think we should add a flush function; it would just be another name 
> > for an already existing function. Referring to the required
> > operation in the memcpy_nt() function documentation should suffice.
> >   
> 
> Ok, but again wouldn't it be arch specific?
> AFAIK for x86 it needs to boil down to sfence, for other architectures - I
> don't know.
> If you think there already is some generic one (rte_wmb?) that would always
> produce correct instructions - sure let's use it.
>  
>  

It makes sense in a few select places to use non-temporal copy.
But it would add unnecessary complexity to DPDK if every function in DPDK
that could cause a copy had a non-temporal variant.

Maybe rte_memcpy could just have a threshold (config value?), so that any copy
larger than a certain size would automatically be non-temporal. Small copies
wouldn't matter; the optimization is more about avoiding cache pollution
with large streams of data.
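
To make the threshold idea concrete, here is a rough sketch of what I mean.
It is not the rte_memcpy implementation: memcpy_auto_nt() and
MEMCPY_NT_THRESHOLD are made-up names, the 256 KB default is an arbitrary
placeholder rather than a measured value, and the non-temporal path assumes
x86 with SSE2 (anything else just falls back to plain memcpy).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical threshold; not an existing DPDK config option. */
#ifndef MEMCPY_NT_THRESHOLD
#define MEMCPY_NT_THRESHOLD (256 * 1024)
#endif

/* Hypothetical helper for illustration only; not part of the DPDK API. */
static inline void *
memcpy_auto_nt(void *dst, const void *src, size_t n)
{
#if defined(__SSE2__)
	if (n >= MEMCPY_NT_THRESHOLD) {
		uint8_t *d = dst;
		const uint8_t *s = src;

		/* Copy the head normally until dst is 16-byte aligned. */
		size_t head = (size_t)(-(uintptr_t)d & 15);
		if (head > n)
			head = n;
		memcpy(d, s, head);
		d += head;
		s += head;
		n -= head;

		/* Streaming stores bypass the cache; source may be unaligned. */
		for (; n >= 16; d += 16, s += 16, n -= 16)
			_mm_stream_si128((__m128i *)d,
					 _mm_loadu_si128((const __m128i *)s));

		/* Remaining tail bytes (fewer than 16) go through the cache. */
		memcpy(d, s, n);

		/* Order the weakly-ordered streaming stores before later stores. */
		_mm_sfence();
		return dst;
	}
#endif
	/* Small copies (or non-SSE2 builds): ordinary cached copy. */
	return memcpy(dst, src, n);
}
```

One consequence of doing it automatically: the caller can no longer tell
which path was taken, so in this sketch the sfence has to live inside the
function instead of being left to the developer as discussed above.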
