On Thu, Mar 14, 2024 at 11:49:14AM +0000, Richard Earnshaw (lists) wrote: > On 14/03/2024 08:37, Jakub Jelinek wrote: > > Hi! > > > > The following testcase ICEs with LSE atomics. > > The problem is that the @atomic_compare_and_swap<mode> expander uses > > aarch64_reg_or_zero predicate for the desired operand, which is fine, > > given that for most of the modes and even for TImode in some cases > > it can handle zero immediate just fine, but the TImode > > @aarch64_compare_and_swap<mode>_lse just uses register_operand for > > that operand instead, again intentionally so, because the casp, > > caspa, caspl and caspal instructions need to use a pair of consecutive > > registers for the operand and xzr is just one register and we can't > > just store zero into the link register to emulate pair of zeros. > > > > So, the following patch fixes that by forcing the newval operand into > > a register for the TImode LSE case. > > > > Bootstrapped/regtested on aarch64-linux, ok for trunk? > > An alternative fix would be to use a mode_attr to pick a different predicate > for TImode. But that's probably just a matter of taste; I'm not sure that > one would be better than the other in reality.
The reason I didn't do something like that is that for the non-LSE case, @aarch64_compare_and_swap<mode> pattern actually handles const0_rtx desired operand and so does the TARGET_OUTLINE_ATOMICS case, so if it was done through a predicate, it would need to be a predicate that would be specific to this case and would yield register_operand for TImode && TARGET_LSE and and otherwise aarch64_reg_or_zero. The patch seems to be simpler to me, especially when aarch64_expand_compare_and_swap already does the forcing stuff into registers at various other spots. Jakub