On Mon, Nov 18, 2024, at 13:22, Maciej W. Rozycki wrote:
> On Mon, 18 Nov 2024, Arnd Bergmann wrote:
>
>> >  This patch series addresses these issues in the last two changes, having 
>> > made generic test suite updates to improve coverage in the concerned area 
>> > first and then having addressed a bunch of issues in the code affected I 
>> > discovered in the course of this effort.  There is a patch that includes 
>> > pair of changes to the middle end (reload+LRA) required by this update as 
>> > well, so it's not a purely backend-limited change, and hence no "Alpha:" 
>> > prefix on the cover letter or the relevant patches.
>> 
>> I don't know enough about gcc internals to understand exactly
>> which problems you are addressing, but I think there are still
>> two issues that I identifier that remain unchanged here:
>> 
>> a) storing an '_Atomic' variable smaller than 8 bytes on non-bwx
>>    targets should use ll/sc, but uses a plain rmw cycle.
>
>  I think there should be no problem with aligned 4-byte (longword) data
> quantities, given that there are both plain and locked/conditional load 
> and store machine operations provided by the non-BWX architecture.  Why do 
> you think otherwise?

It's quite possible that I misremembered this and it was only
a problem on 1-byte and 2-byte stores, even for _Atomic.

>  As to byte and aligned 2-byte (word) data quantities given that with 
> `-msafe-bwa' *all* stores of these quantities are made via an LL/SC 
> sequence (with a non-BWX target) I believe this has been sorted with my 
> proposal:
>
> $ cat atomic16_t.c
> typedef _Atomic unsigned short atomic16_t;
>
> void store_atomic16 (atomic16_t *p, unsigned short v)
> {
>       *p = v;
> }

> $ alpha-linux-gnu-gcc -msafe-bwa -O2 -S atomic16_t.c -o atomic16_t-bwa.c
> $ sed -n '/\.ent/,/\.end/p' <atomic16_t-bwa.s

> This does use an LL/SC sequence.  Original non-BWX sequence uses plain 
> unaligned load/store operations instead:

Right, with -msafe-bwa this would be covered, but I'm not sure
if you'd expect to build all of userspace with that flag as well.

I can certainly see how one would argue that userspace doesn't
need to use LL/SC for non-atomic subword access, but at the same
time I think the RMW sequence on _Atomic variables is a clear
bug that you'd need to fix also for -mno-safe-bwa.

>> b) Accessing a 'volatile' variable may have conflicting
>>    requirements, and I'm not sure if either an ll/sc or a rmw
>>    is actually correct since neither is free of side-effects.
>>    Linux currently assumes that a store through a 'volatile
>>    short *' pointer does not clobber adjacent data which
>>    would need an atomic, but most CPUs (not sure about Alpha)
>>    trap if you ever try an atomic operation on an MMIO
>>    register address.
>
>  To avoid triggering side effects Alpha system chipsets define a sparse 
> I/O space decoding window where data locations are spaced apart such that 
> no BWX operations are required to read or write individual 8-bit, or 
> 16-bit, or even 24-bit peripheral registers.  With DEC's own peripheral 
> bus solutions it may not have been always necessary, but surely it has 
> been to support PCI with the original Alpha implementation (EV4).  We do 
> have support for sparse I/O space operations in Linux, we've always had.
>
>  Does my reply address your concerns?

It does address the immediate concern about MMIO registers, but I
think there is still an open question regarding what the correct
behavior on volatile variables should be in the absence of -msafe-bwa.

      Arnd

Reply via email to