Re: [RFC v2] non-temporal memcpy

Mattias Rönnblom Thu, 11 Aug 2022 04:53:10 -0700

On 2022-08-10 23:20, Honnappa Nagarahalli wrote:

<snip>

From: Mattias Rönnblom [mailto:[email protected]]
Sent: Wednesday, 10 August 2022 13.56

On 2022-08-09 17:26, Stephen Hemminger wrote:


[...]


Alignment seems like a non-issue to me. An NT-store memcpy() can be
made free of alignment requirements, incurring only a very slight cost
for the always-aligned case (who has their data always 16-byte aligned
anyways?).

The memory barrier required on x86 seems like a bigger issue.

Maybe rte_non_cache_copy()?


rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a
rte_memcpy_nt() with the sfence in place, which the user hopefully
will find first? I don't know. I would prefer not having the weak
variant at all.

I think providing a weakly ordered version is required to offset the
cost of the barriers. One might be able to copy multiple packets and
then issue a single barrier.
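
A minimal sketch of that batching idea, for illustration only. The names
copy_nt_weak(), nt_store_fence() and copy_burst() are hypothetical and not
part of DPDK or of the RFC; the sketch assumes x86-64 with SSE2 and falls
back to plain memcpy() elsewhere. It also shows one way to avoid alignment
requirements: regular stores cover the unaligned head and tail, and NT
stores are used only for the 16-byte-aligned body.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h> /* SSE2: _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#define HAVE_NT_STORES 1
#else
#define HAVE_NT_STORES 0
#endif

/* Hypothetical helper (not a DPDK API): copy using NT stores, no barrier. */
static void copy_nt_weak(void *dst, const void *src, size_t len)
{
#if HAVE_NT_STORES
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Regular stores until the destination is 16-byte aligned. */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

    /* NT stores for the aligned body; the source may be unaligned. */
    for (; len >= 16; d += 16, s += 16, len -= 16)
        _mm_stream_si128((__m128i *)d,
                         _mm_loadu_si128((const __m128i *)s));

    /* Regular stores for the remaining tail (fewer than 16 bytes). */
    memcpy(d, s, len);
#else
    memcpy(dst, src, len);
#endif
}

/* Hypothetical helper: order earlier NT stores before later stores. */
static void nt_store_fence(void)
{
#if HAVE_NT_STORES
    _mm_sfence();
#endif
}

/* Copy a burst of buffers, paying for the barrier once per burst. */
static void copy_burst(void *const dst[], const void *const src[],
                       const size_t len[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        copy_nt_weak(dst[i], src[i], len[i]);
    nt_store_fence();
}
```

The caller must not publish any of the copied buffers to another core
(for example by enqueueing pointers to them) before copy_burst() returns.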


On what architecture?

I assumed that only x86 had the peculiar property of having different
memory models for regular and NT load/stores.


Accepting weak memory ordering (i.e., no sfence) could also be one of
the flags, assuming rte_memcpy_nt() would have a flags parameter.
Default is safe (=memcpy() semantics), but potentially slower.
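
A rough sketch of what such a flags parameter might look like. The names
ex_memcpy_nt() and EX_MEMCPY_NT_F_WEAK_ORDER are made up for this example
and are not the RFC's actual API; x86-64 with SSE2 is assumed, with a
plain memcpy() fallback on other architectures.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h> /* SSE2: _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#define EX_HAVE_NT 1
#else
#define EX_HAVE_NT 0
#endif

/* Hypothetical flag: the caller accepts weak ordering (no trailing sfence). */
#define EX_MEMCPY_NT_F_WEAK_ORDER (UINT64_C(1) << 0)

/* Hypothetical NT memcpy: with flags == 0 it is ordered like memcpy(). */
static void *ex_memcpy_nt(void *dst, const void *src, size_t len,
                          uint64_t flags)
{
#if EX_HAVE_NT
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Regular stores until the destination is 16-byte aligned. */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

    /* NT stores for the 16-byte-aligned body. */
    for (; len >= 16; d += 16, s += 16, len -= 16)
        _mm_stream_si128((__m128i *)d,
                         _mm_loadu_si128((const __m128i *)s));

    /* Regular stores for the tail. */
    memcpy(d, s, len);

    /* Safe default: fence unless the caller explicitly opted out. */
    if (!(flags & EX_MEMCPY_NT_F_WEAK_ORDER))
        _mm_sfence();
#else
    (void)flags; /* no separate NT memory model assumed here */
    memcpy(dst, src, len);
#endif
    return dst;
}
```

With this shape, a plain s/memcpy/ex_memcpy_nt/ plus a trailing 0 argument
keeps the familiar ordering, and only callers who pass the flag take on
the job of issuing their own barrier.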


Excellent idea!

Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and
expecting everything to work.
