https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80878
--- Comment #46 from Luke Dalessandro <ldalessandro at gmail dot com> ---
(In reply to Xi Ruoyao from comment #45)
> (In reply to Luke Dalessandro from comment #44)
> > Now that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 was resolved is
> > it possible to actually get the atomic/atomic_ref to generate cmpxchg16b? Or
> > is this still blocked? As everyone who is trying to write lock-free
> > algorithms has pointed out, not doing so is an issue.
>
> No, PR104688 means using vector load for 16B atomic when the CPU guarantees
> the vector load is atomic. It has nothing to do with this issue.

Okay, thanks.

I was basically responding in the context of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80878#c38 since I thought that the
issue was that cmpxchg16b can't be used for loading, and thus loading needed
locking, and thus using cmpxchg16b to implement compare_exchange would not
properly synchronize with the load.

But if 104688 isn't related to this issue, and thus Jakub's comment was in
error, I definitely don't understand the underlying problem and why clang is
fine doing it.
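
For readers following the reasoning above, here is a minimal sketch of what "cmpxchg16b can't be used for loading" is usually taken to mean. It is not code from GCC, libatomic, or Clang: `load_via_cas` is a hypothetical helper, and the demo assumes an 8-byte type so that it links without libatomic. The idea is that a load built only from compare-exchange always needs write access to the object.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from GCC, libatomic, or Clang.
// Reads an atomic using only compare-exchange, the way a 16-byte load has to
// be done when cmpxchg16b is the only 16-byte atomic instruction available.
template <class T>
T load_via_cas(std::atomic<T>& a) {  // non-const: the CAS may write to `a`
    T expected{};  // arbitrary guess at the current value
    // Guess right: the same value is stored back (still a write to memory).
    // Guess wrong: the CAS fails and copies the current value into `expected`.
    a.compare_exchange_strong(expected, expected);
    return expected;
}
```

With a 16-byte `T` the pattern would be the same. Because the operation writes, it cannot serve a load from a const object or from read-only memory, and a load that takes a lock instead is not ordered against an inline cmpxchg16b on the same object.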