Hi Reinette, Richard,

On 6/26/26 04:26, Reinette Chatre wrote:
> +Ben
>
> Hi Richard,
>
> On 5/28/26 7:23 PM, Richard Cheng wrote:
>> cl_flush() and sb() in fill_buf.c only have implementations for i386
>> and x86_64, so on aarch64 both compile to empty functions. mem_flush()
>> then walks the buffer calling a no-op cl_flush() per cache line and
>> finishes with a no-op sb(), leaving any caller that expects a flushed
>> buffer (e.g. CMT, L3_CAT) operating on unflushed state with no warning.
>>
>> Add an aarch64 code block using the ARM equivalents:
>> * "dc civac, %0" for cl_flush()
>> * "dsb sy" for sb()
>
> Calling on Arm experts here since my superficial check found sfence to
> be used for __wmb() on x86 and the Arm equivalent per
> arch/arm64/include/asm/barrier.h appears to be "dsb st"?

Referring to the Arm reference manual (DDI0487 version M.a.a):

D7.5.9.15 Ordering and completion of data and instruction cache instructions

This talks about using dsb for the synchronization and also states:

"In all cases, where the text in this section refers to a DMB or a DSB,
this means a DMB or DSB whose required access type is both loads and
stores."

Hence, in this case a "dsb st" is insufficient, as its required access
type is stores only. A full "dsb sy" would work to synchronize the
"dc civac". However, I don't think "dc civac" fulfills the role of what
is expected of cl_flush().

>
> Even so, it looks like the changes below were considered by Ben during
> a previous submission but I am not able to tell if his feedback was taken
> into account here.
> Please see:
> https://lore.kernel.org/lkml/[email protected]/
> https://lore.kernel.org/lkml/[email protected]/

My understanding is that the resctrl selftests want to use cl_flush()
to invalidate entries in a system level cache for testing the cache
portion bitmaps. However, the mechanism to invalidate the system level
cache is generally implementation defined.

>
>>
>> Both instructions are EL0-accessible on Linux aarch64.
>>
>> Signed-off-by: Richard Cheng <[email protected]>
>> ---
>>  tools/testing/selftests/resctrl/fill_buf.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/tools/testing/selftests/resctrl/fill_buf.c
>> b/tools/testing/selftests/resctrl/fill_buf.c
>> index 19a01a52dc1a..a41d21e5a64e 100644
>> --- a/tools/testing/selftests/resctrl/fill_buf.c
>> +++ b/tools/testing/selftests/resctrl/fill_buf.c
>> @@ -27,6 +27,9 @@ static void sb(void)
>>  #if defined(__i386) || defined(__x86_64)
>>  	asm volatile("sfence\n\t"
>>  		     : : : "memory");
>> +#elif defined(__aarch64__)
>> +	asm volatile("dsb sy\n\t"
>> +		     : : : "memory");
>>  #endif
>>  }
>>
>> @@ -35,6 +38,9 @@ static void cl_flush(void *p)
>>  #if defined(__i386) || defined(__x86_64)
>>  	asm volatile("clflush (%0)\n\t"
>>  		     : : "r"(p) : "memory");
>> +#elif defined(__aarch64__)
>> +	asm volatile("dc civac, %0\n\t"
>> +		     : : "r"(p) : "memory");

This will only clean to the Point of Coherency (PoC). To quote the Arm
reference manual (DDI0487 version M.a.a):

D7.5.9.2 The data cache maintenance instruction (DC)

If there are caches after the Point of Coherency and FEAT_PoPS is not
implemented, then the DC CIVAC and DC CIGDVAC instructions are not
sufficient to remove all copies of a poisoned Location and it is
IMPLEMENTATION DEFINED whether any IMPLEMENTATION DEFINED mechanism
exists to remove poison from a Location.

In most systems the SLC, where your MPAM cache portions are likely to
be, will be past the PoC, and I'd not expect FEAT_PoPS to be implemented.

Thanks,

Ben

>>  #endif
>>  }
>>
>
> Reinette
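
For illustration only, here is a minimal sketch of the flush walk under
discussion. It is not the code in fill_buf.c: flush_supported() and
mem_flush_sketch() are hypothetical names, and the 64-byte CL_SIZE is an
assumption. The sketch reports when a build has no flush sequence instead
of silently doing nothing. The aarch64 branch still only reaches the PoC,
so it would not by itself evict lines from a cache that sits past the PoC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; not guaranteed on every system. */
#define CL_SIZE 64

/* Hypothetical helper: 1 if this build has a flush sequence, else 0. */
static int flush_supported(void)
{
#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
	return 1;
#else
	return 0;
#endif
}

/* Flush the cache line that contains p. */
static void cl_flush(void *p)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile("clflush (%0)\n\t" : : "r"(p) : "memory");
#elif defined(__aarch64__)
	/* Clean and invalidate by VA, to the PoC only. */
	__asm__ volatile("dc civac, %0\n\t" : : "r"(p) : "memory");
#else
	(void)p;
#endif
}

/* Barrier issued once after the whole walk. */
static void sb(void)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ volatile("sfence\n\t" : : : "memory");
#elif defined(__aarch64__)
	/* Full DSB: required access type is both loads and stores. */
	__asm__ volatile("dsb sy\n\t" : : : "memory");
#endif
}

/*
 * Hypothetical variant of the buffer walk: returns 1 if the flush
 * sequence was issued (or there was nothing to flush), and 0 if this
 * build cannot flush at all, so the caller can skip or warn.
 */
static int mem_flush_sketch(unsigned char *buf, size_t buf_size)
{
	uintptr_t line, end;

	if (!flush_supported())
		return 0;
	if (buf_size == 0)
		return 1;

	assert(buf != NULL);

	/* Start at the line containing buf[0], stop past the last byte. */
	line = (uintptr_t)buf & ~(uintptr_t)(CL_SIZE - 1);
	end = (uintptr_t)buf + buf_size;
	for (; line < end; line += CL_SIZE)
		cl_flush((void *)line);

	sb();
	return 1;
}
```

A caller could check the return value of mem_flush_sketch() and skip the
test with a message when it is 0, rather than measuring an unflushed
buffer.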

