On 29.01.25 12:14, Segher Boessenkool wrote:
> On Wed, Jan 29, 2025 at 10:45:10AM +0100, Julian Vetter wrote:
>> Remove the eieio() calls in IO functions for PowerPC. While other
>> architectures permit prefetching, combining, and reordering, the eieio()
>> calls on PowerPC prevent such optimizations.
>
> Yes, and it is crucial to prevent combining, it is part of the semantics
> of these functions. This is a much bigger problem on PowerPC than on
> architectures which optimise memory accesses much less. So most other
> archs can get away with it much easier (but it is still completely wrong
> there).
>
> You are keeping the trap;isync things, which a) have a way bigger
> performance impact, and b) are merely a debugging aid (if some i/o
> access kills the system, it will be clear where that came from). And
> that isn't even the biggest thing of course, there is a heavyweight
> sync in there as well. Is there any benefit to this patch, or is it
> only sabotage?

Hello Segher,
thank you for your explanation. Yes, indeed, it was a bit rude to just
send this patch out of the blue. I should have explained what the
purpose of this patch was in the first place. I would like to align
"most" arch-specific implementations of memcpy_fromio, memcpy_toio, and
memset_io, so they can all use the one from lib/iomem_copy.c instead of
each having its own. So, I wanted to first bring the implementation on,
e.g., powerpc closer to the "generic" one, because I had the impression
that it's very similar, except for this eieio() after every read.
But if it's really mandatory to have these eieio() instructions, then
the alternative would be to either keep the powerpc implementation or
add a generic '#define __io_mbr' (or a different name) that is called
after every read and resolves to eieio() on powerpc, and maybe to
nothing or something else on other architectures.
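
To make the second option a bit more concrete, here is a rough userspace
sketch, not a patch. The name __io_mbr is only the placeholder from
above; sketch_memcpy_fromio and io_mbr_calls are made up for
illustration, and the counter merely stands in for eieio() so that this
compiles outside the kernel. It copies byte-wise only and leaves out the
sync and trap;isync parts mentioned above.

```c
#include <stddef.h>

/*
 * Hypothetical arch override. On powerpc this would presumably be
 *     #define __io_mbr() eieio()
 * In this userspace sketch a counter stands in for the barrier.
 */
static unsigned long io_mbr_calls;
#define __io_mbr() ((void)io_mbr_calls++)

/*
 * Hypothetical generic fallback: architectures that do not define
 * __io_mbr get a no-op.
 */
#ifndef __io_mbr
#define __io_mbr() do { } while (0)
#endif

/*
 * Simplified stand-in for a generic memcpy_fromio: one byte per read,
 * with the hook invoked after every read.
 */
static void sketch_memcpy_fromio(void *dst, const volatile void *src,
				 size_t count)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (count--) {
		*d++ = *s++;
		__io_mbr();	/* barrier after every read */
	}
}
```

With this shape, the generic copy loop would stay the same for all
architectures, and only the definition of the hook would differ.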
Julian
Segher