On 6 August 2018 at 13:19, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
> On 08/06/2018 04:01 PM, Mikulas Patocka wrote:
>>
>> I think there are three possible solutions:
>>
>> 1. provide an alternative memcpy implementation that doesn't do unaligned
>> accesses and recompile the graphics software with -mstrict-align
>
> Given that there's already a tunable glibc.cpu.cached_memopt for powerpc
> that (as Tulio clarified elsewhere) essentially does the same thing for
> cache-inhibited memory, it wouldn't be too much of an overhead to put in
> another ifunc implementation that gets chosen only when one sets this
> tunable. In fact, we could reuse the C string routines for this to avoid
> adding yet another assembly implementation to have to support. That way we
> can minimally fix the issue at hand without regressing existing uses.
>
> You can then set the glibc.cpu.cached_memopt tunable in the default
> environment for your board[1] or for applications that need it (e.g.
> whenever DISPLAY is exported or something like that).
>
> The only difference from Power would be that cpu.noncached==0 for Power by
> default whereas for aarch64 it will be the other way around. It shouldn't
> be too hard to enhance the framework to set platform-specific defaults.
>
Thanks Siddhesh,

But we don't need another memcpy(). We need outbound PCIe windows that
tolerate being mapped as normal non-cacheable memory. And if this is
fundamentally impossible, can someone please try explaining it again?

(apologies for being thick)