On 2022-08-09 17:26, Stephen Hemminger wrote:
On Tue, 9 Aug 2022 11:46:19 +0200
Morten Brørup <m...@smartsharesystems.com> wrote:
I don't think memcpy() functions should have alignment requirements.
That's not very practical, and violates the principle of least
surprise.
I didn't make the CPUs with these alignment requirements.
However, I will offer optimized performance in a generic NT memcpy() function
in the cases where the individual alignment requirements of various CPUs happen
to be met.
Rather than making a generic equivalent memcpy function, why not have
something which only takes aligned data? And, to avoid user confusion,
change the name to something not suggestive of memcpy.
Alignment seems like a non-issue to me. An NT-store memcpy() can be made
free of alignment requirements, incurring only a very slight cost for
the always-aligned case (who has their data always 16-byte aligned
anyways?).
The memory barrier required on x86 seems like a bigger issue.
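
To make the alignment point concrete, here is a minimal sketch of what I
have in mind, assuming x86 with SSE2. The name nt_copy_any_align() is
made up for illustration and is not an existing DPDK function. The head
and tail use regular stores, only the 16-byte aligned middle uses NT
stores, and the sfence at the end is the barrier mentioned above. On
non-SSE2 builds the sketch just falls back to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, for illustration only; not part of DPDK. */
static void *
nt_copy_any_align(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
	uint8_t *d = dst;
	const uint8_t *s = src;

	/* Head: regular stores until dst reaches a 16-byte boundary. */
	size_t head = (size_t)(-(uintptr_t)d & 15);
	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Body: unaligned loads, aligned non-temporal stores. */
	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);
		_mm_stream_si128((__m128i *)d, v);
		d += 16;
		s += 16;
		len -= 16;
	}

	/* Tail: regular stores for whatever is left (fewer than 16 bytes). */
	memcpy(d, s, len);

	/* Order the NT stores before any store issued after the copy. */
	_mm_sfence();
#else
	/* No SSE2: plain copy, which already has memcpy() ordering. */
	memcpy(dst, src, len);
#endif
	return dst;
}
```

The extra cost for a caller whose data is already aligned is the head
computation and two zero-length memcpy() calls.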
Maybe rte_non_cache_copy()?
rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a
rte_memcpy_nt() with the sfence in place, which the user hopefully will
find first? I don't know. I would prefer not having the weak variant at all.
Accepting weak memory ordering (i.e., no sfence) could also be one of
the flags, assuming rte_memcpy_nt() would have a flags parameter.
Default is safe (=memcpy() semantics), but potentially slower.
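
A rough sketch of the flags idea, again assuming x86 with SSE2. Both
nt_copy_flags() and NT_COPY_F_WEAK_ORDER are hypothetical names chosen
for this example, not a proposed or existing API. To keep it short, NT
stores are only used when dst is already 16-byte aligned. Passing 0 as
flags gives the safe default with the sfence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical flag: caller accepts weak ordering and fences on its own. */
#define NT_COPY_F_WEAK_ORDER (1u << 0)

/* Hypothetical function, for illustration only; not part of DPDK. */
static void *
nt_copy_flags(void *dst, const void *src, size_t len, unsigned int flags)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Reject flag bits this sketch does not know about. */
	assert((flags & ~NT_COPY_F_WEAK_ORDER) == 0);

#if defined(__SSE2__)
	/* Simplification: NT stores only if dst is 16-byte aligned. */
	if (((uintptr_t)d & 15) == 0) {
		for (; len >= 16; d += 16, s += 16, len -= 16)
			_mm_stream_si128((__m128i *)d,
					 _mm_loadu_si128((const __m128i *)s));
	}
#endif
	/* Remaining bytes (or everything, if no NT path was taken). */
	memcpy(d, s, len);

#if defined(__SSE2__)
	/* Default (flags == 0): fence, so callers get memcpy() ordering. */
	if (!(flags & NT_COPY_F_WEAK_ORDER))
		_mm_sfence();
#endif
	return dst;
}
```

With this shape, a plain s/memcpy/.../ with flags set to 0 stays
correct, and skipping the fence takes an explicit opt-in.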
Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and expecting
everything to work.