https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68908
--- Comment #6 from Martin Sebor <msebor at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #2)
> Doesn't seem to be ppc64le specific in any way, and doesn't affect just
> preincrement.

The inefficiency I was pointing out was the redundant syncs above the loop on powerpc64. The x86_64 assembly looks fairly efficient both ways.

I also intentionally focused the bug on the increment expression and didn't mention others like compound assignment because I expected the former to be more common. But I suppose ++a really should be just as efficient as a += 1, which in turn shouldn't be any less efficient than a += X for any arbitrary X.

If it's preferable to treat this as a generic opportunity to improve the efficiency of all atomic expressions (perhaps along with those discussed on the Wiki: https://gcc.gnu.org/wiki/Atomic/GCCMM/Optimizations), that sounds great.
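
For reference, the three forms being compared can be put side by side as in the sketch below. This is not the test case from the report; the variable `counter` and the function names are hypothetical and chosen only for illustration. In C11 all three operators perform a sequentially consistent atomic read-modify-write, so one would expect them to compile to comparable code, and the last function spells out the same operation with the library call `atomic_fetch_add`.

```c
#include <stdatomic.h>

/* Hypothetical shared counter; not taken from the bug report. */
static _Atomic int counter;

/* Preincrement: atomically adds 1 and yields the new value. */
int preinc(void)
{
    return ++counter;
}

/* Compound assignment with the constant 1; same semantics as ++counter. */
int add_one(void)
{
    return counter += 1;
}

/* Compound assignment with an arbitrary operand. */
int add_n(int n)
{
    return counter += n;
}

/* The same operation written with the explicit library call.
   atomic_fetch_add returns the old value, so n is added to get the new one. */
int fetch_add_n(int n)
{
    return atomic_fetch_add(&counter, n) + n;
}
```

Compiling this with something like `gcc -O2 -S` for each target and comparing the assembly emitted for the four functions is one way to check whether the forms are treated alike.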