https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080
--- Comment #12 from Ilya Leoshkevich <iii at linux dot ibm.com> ---

I've investigated foo3, foo4 and foo5, and came to the following conclusions:

When foo3 is compiled with -march=z10 or later, the cprop1 pass propagates the global's SYMBOL_REF value into UNSPECV_CAS. On earlier machines this does not happen, because the result is rejected by insn_invalid_p (). Then reload realizes that a SYMBOL_REF cannot be a legitimate UNSPECV_CAS argument, and loads it into a pseudo right before the insn. The net result is that the load of the SYMBOL_REF is moved from outside of the loop into the loop. So we need to somehow inhibit constant propagation for this case.

Jump threading in foo4 does not work, because it's done only during the `jump' pass, at which point there are insns with side-effects in the basic block of the 2nd jump. They are later deleted by the `combine' pass, but we don't request CLEANUP_THREADING after that. I wonder if we could introduce it?

In addition, when foo4 is compiled with -O2 or -O3, we don't use conditional return, because our return sequence contains a PARALLEL, which is rejected by bb_is_just_return (). This can also be improved.

Finally, in foo5 `cs' is generated by s390_expand_cs_tdsi (), and the comparison is generated by common expansion logic, so it doesn't look possible to improve the situation solely in the back-end. We need to somehow make gcc aware that (oldval == 0) and (retval != 0) are equivalent after `cs', but I'm not sure at which point we could and should do this - in theory, doing this on the tree rather than the RTL level could help other architectures as well.
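To make the equivalence mentioned for foo5 concrete, here is a minimal C sketch. It is not the foo5 test case from this PR: try_lock, struct cas_result and conditions_agree are hypothetical names, and the sketch assumes a strong (non-weak) compare-and-swap via the GCC __atomic_compare_exchange_n builtin, with expected value 0 and new value 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result pair: the success flag of the compare-and-swap
   and the contents of the "expected" variable afterwards.  */
struct cas_result
{
  bool retval;
  int oldval;
};

/* Hypothetical helper, not the foo5 from the bug report: try to
   replace 0 with 1 in *mem using a strong compare-and-swap.  */
static struct cas_result
try_lock (int *mem)
{
  int oldval = 0;
  /* On success oldval keeps its value 0.  On failure the builtin
     stores the current (necessarily nonzero) contents of *mem into
     oldval.  */
  bool retval = __atomic_compare_exchange_n (mem, &oldval, 1, false,
                                             __ATOMIC_SEQ_CST,
                                             __ATOMIC_SEQ_CST);
  return (struct cas_result) { retval, oldval };
}

/* Return 1 if (oldval == 0) and (retval != 0) agree after the
   compare-and-swap, for the given initial contents of the memory
   word.  */
static int
conditions_agree (int initial)
{
  int mem = initial;
  struct cas_result r = try_lock (&mem);
  return (r.oldval == 0) == (r.retval != 0);
}
```

Calling conditions_agree with 0 exercises the success path and any nonzero value exercises the failure path; in both, the two conditions agree, so testing one of them makes a separate test of the other redundant. Note that this only holds for a strong compare-and-swap: a weak one may fail spuriously and leave oldval equal to 0.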