Hi Reinette, Richard,

On 6/26/26 04:26, Reinette Chatre wrote:
> +Ben
> 
> Hi Richard,
> 
> On 5/28/26 7:23 PM, Richard Cheng wrote:
>> cl_flush() and sb() in fill_buf.c only have implementations for i386
>> and x86_64, so on aarch64 both compile to empty functions. mem_flush()
>> then walks the buffer calling a no-op cl_flush() per cache line and
>> finishes with a no-op sb(), leaving any caller that expects a flushed
>> buffer (e.g. CMT, L3_CAT) operating on unflushed state with no warning.
>>
>> Add an aarch64 code block using the ARM equivalents:
>> * "dc civac, %0" for cl_flush()
>> * "dsb sy"       for sb()
> 
> Calling on Arm experts here since my superficial check found sfence to
> be used for __wmb() on x86 and the Arm equivalent per
> arch/arm64/include/asm/barrier.h appears to be "dsb st"?

Referring to the Arm Architecture Reference Manual (DDI0487 version M.a.a):
D7.5.9.15 Ordering and completion of data and instruction cache
instructions
This talks about using dsb for the synchronization and also states:
"In all cases, where the text in this section refers to a DMB or a DSB,
this means a DMB or DSB whose required access type is both loads and
stores."

Hence, in this case a "dsb st" is insufficient, as its required access
type is stores only rather than both loads and stores. A full "dsb sy"
would work to synchronize the "dc civac".
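
To make the ordering point concrete, here is a minimal sketch of the
flush loop with a full barrier at the end. It is only an illustration:
the *_sketch names, the 64-byte CL_SIZE and the compiler-barrier
fallback for other architectures are my assumptions, not the code in
fill_buf.c, and the buffer is assumed to be a multiple of CL_SIZE. On
aarch64 it still only reaches the PoC.

```c
#include <assert.h>
#include <stddef.h>

#define CL_SIZE 64 /* assumed cache line size, for illustration only */

/* Barrier issued once after all per-line maintenance operations. */
static inline void sb_sketch(void)
{
#if defined(__i386) || defined(__x86_64)
	__asm__ volatile("sfence\n\t" : : : "memory");
#elif defined(__aarch64__)
	/* "dsb sy": required access type is both loads and stores. */
	__asm__ volatile("dsb sy\n\t" : : : "memory");
#else
	/* Assumption: no architecture support, compiler barrier only. */
	__asm__ volatile("" : : : "memory");
#endif
}

/* Per-line cache maintenance for the line containing p. */
static inline void cl_flush_sketch(void *p)
{
#if defined(__i386) || defined(__x86_64)
	__asm__ volatile("clflush (%0)\n\t" : : "r"(p) : "memory");
#elif defined(__aarch64__)
	/* Clean and invalidate by VA, to the Point of Coherency only. */
	__asm__ volatile("dc civac, %0\n\t" : : "r"(p) : "memory");
#else
	(void)p; /* nothing to do in the fallback case */
#endif
}

/* Walk the buffer one cache line at a time, then synchronize once. */
static void mem_flush_sketch(unsigned char *buf, size_t size)
{
	size_t i;

	assert(buf != NULL);

	for (i = 0; i < size / CL_SIZE; i++)
		cl_flush_sketch(&buf[i * CL_SIZE]);

	sb_sketch();
}
```

The single barrier after the loop is what the quoted section is about:
the maintenance operations are only guaranteed complete once a DSB
covering both loads and stores has executed.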

However, I don't think "dc civac" fulfills the role of what is expected
of cl_flush().

> 
> Even so, it looks like the changes below were considered by Ben during
> a previous submission but I am not able to tell if his feedback was taken
> into account here.
> Please see:
> https://lore.kernel.org/lkml/[email protected]/
> https://lore.kernel.org/lkml/[email protected]/

My understanding is that the resctrl selftests want to use cl_flush()
to invalidate entries in a system level cache in order to test the cache
portion bitmaps. However, the mechanism to invalidate the system level
cache is generally implementation defined.

> 
>>
>> Both instructions are EL0-accessible on Linux aarch64.
>>
>> Signed-off-by: Richard Cheng <[email protected]>
>> ---
>>  tools/testing/selftests/resctrl/fill_buf.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
>> index 19a01a52dc1a..a41d21e5a64e 100644
>> --- a/tools/testing/selftests/resctrl/fill_buf.c
>> +++ b/tools/testing/selftests/resctrl/fill_buf.c
>> @@ -27,6 +27,9 @@ static void sb(void)
>>  #if defined(__i386) || defined(__x86_64)
>>      asm volatile("sfence\n\t"
>>                   : : : "memory");
>> +#elif defined(__aarch64__)
>> +    asm volatile("dsb sy\n\t"
>> +                 : : : "memory");
>>  #endif
>>  }
>>  
>> @@ -35,6 +38,9 @@ static void cl_flush(void *p)
>>  #if defined(__i386) || defined(__x86_64)
>>      asm volatile("clflush (%0)\n\t"
>>                   : : "r"(p) : "memory");
>> +#elif defined(__aarch64__)
>> +    asm volatile("dc civac, %0\n\t"
>> +                 : : "r"(p) : "memory");


This will only clean and invalidate to the Point of Coherency (PoC).

To quote the Arm Architecture Reference Manual (DDI0487 version M.a.a):
D7.5.9.2 The data cache maintenance instruction (DC)

If there are caches after the Point of Coherency and FEAT_PoPS is not
implemented, then the DC CIVAC and DC CIGDVAC instructions are not
sufficient to remove all copies of a poisoned Location and it is
IMPLEMENTATION DEFINED whether any IMPLEMENTATION DEFINED mechanism
exists to remove poison from a Location.

In most systems the SLC, where your MPAM cache portions are likely to
be, will be past the PoC, and I would not expect FEAT_PoPS to be
implemented.

Thanks,

Ben


>>  #endif
>>  }
>>  
> 
> Reinette

