On Mon, Nov 18, 2024, at 13:22, Maciej W. Rozycki wrote: > On Mon, 18 Nov 2024, Arnd Bergmann wrote: > >> > This patch series addresses these issues in the last two changes, having >> > made generic test suite updates to improve coverage in the concerned area >> > first and then having addressed a bunch of issues in the code affected I >> > discovered in the course of this effort. There is a patch that includes >> > pair of changes to the middle end (reload+LRA) required by this update as >> > well, so it's not a purely backend-limited change, and hence no "Alpha:" >> > prefix on the cover letter or the relevant patches. >> >> I don't know enough about gcc internals to understand exactly >> which problems you are addressing, but I think there are still >> two issues that I identifier that remain unchanged here: >> >> a) storing an '_Atomic' variable smaller than 8 bytes on non-bwx >> targets should use ll/sc, but uses a plain rmw cycle. > > I think there should be no problem with aligned 4-byte (longword) data > quantities, given that there are both plain and locked/conditional load > and store machine operations provided by the non-BWX architecture. Why do > you think otherwise?
It's quite possible that I misremembered this and it was only a problem on 1-byte and 2-byte stores, even for _Atomic. > As to byte and aligned 2-byte (word) data quantities given that with > `-msafe-bwa' *all* stores of these quantities are made via an LL/SC > sequence (with a non-BWX target) I believe this has been sorted with my > proposal: > > $ cat atomic16_t.c > typedef _Atomic unsigned short atomic16_t; > > void store_atomic16 (atomic16_t *p, unsigned short v) > { > *p = v; > } > $ alpha-linux-gnu-gcc -msafe-bwa -O2 -S atomic16_t.c -o atomic16_t-bwa.c > $ sed -n '/\.ent/,/\.end/p' <atomic16_t-bwa.s > This does use an LL/SC sequence. Original non-BWX sequence uses plain > unaligned load/store operations instead: Right, with -msafe-bwa this would be covered, but I'm not sure if you'd expect to build all of userspace with that flag as well. I can certainly see how one would argue that userspace doesn't need to use LL/SC for non-atomic subword access, but at the same time I think the RMW sequence on _Atomic variables is a clear bug that you'd need to fix also for -mno-safe-bwa. >> b) Accessing a 'volatile' variable may have conflicting >> requirements, and I'm not sure if either an ll/sc or a rmw >> is actually correct since neither is free of side-effects. >> Linux currently assumes that a store through a 'volatile >> short *' pointer does not clobber adjacent data which >> would need an atomic, but most CPUs (not sure about Alpha) >> trap if you ever try an atomic operation on an MMIO >> register address. > > To avoid triggering side effects Alpha system chipsets define a sparse > I/O space decoding window where data locations are spaced apart such that > no BWX operations are required to read or write individual 8-bit, or > 16-bit, or even 24-bit peripheral registers. With DEC's own peripheral > bus solutions it may not have been always necessary, but surely it has > been to support PCI with the original Alpha implementation (EV4). We do > have support for sparse I/O space operations in Linux, we've always had. > > Does my reply address your concerns? It does address the immediate concern about MMIO registers, but I think there is still an open question regarding what the correct behavior on volatile variables should be in the absence of -msafe-bwa. Arnd