https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94544
            Bug ID: 94544
           Summary: aarch64 stlr and single total order
           Product: gcc
           Version: 9.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wypelniamkonta at wp dot pl
  Target Milestone: ---

Per https://en.cppreference.com/w/cpp/atomic/memory_order

"std::memory_order specifies how memory accesses, including regular, non-atomic memory accesses, are to be ordered around an atomic operation."

and

"memory_order_seq_cst ... plus a single total order exists in which all threads observe all modifications in the same order"

I ran a test on a Cortex-A72 (rpi4):

struct Atoms
{
    std::atomic<uint64_t> a = 0;
    uint64_t space1[3];
    uint64_t b = 0;
    uint64_t space2[3];
    std::atomic<uint64_t> c = 0;
};

Atoms at;

core 1:

for (int i=0;i<10000000;i++)
{
    at.a.store(i, std::memory_order_seq_cst);
    at.b = i;
    at.c.store(i, std::memory_order_seq_cst);
}

   99198:       c89ffc01        stlr    x1, [x0]
   9919c:       f9000081        str     x1, [x4]
   991a0:       c89ffc61        stlr    x1, [x3]

core 0:

while (core1.isRunning())
{
    dmb();
    int a = at.a.load(std::memory_order_relaxed);
    dmb();
    int b = at.b;
    dmb();
    if (a<b)
    {
        Console::out("%i %i %i\n", b-a, a, b);
    }
}

   995e8:       d5033fbf        dmb     sy
   995ec:       f94002a2        ldr     x2, [x21]
   995f0:       d5033fbf        dmb     sy
   995f8:       f9400023        ldr     x3, [x1]
   995fc:       d5033fbf        dmb     sy
   99600:       6b03005f        cmp     x2, x3

Sometimes the new value of b is observed before the new value of a (the columns are b-a, a, b):

1 9984712 9984713
1 9987016 9987017

This is because stlr is a memory_order_release operation. If I add a barrier after the first store:

at->a.store(i, std::memory_order_seq_cst);
dmb();
at->b = i;
at->c.store(i, std::memory_order_seq_cst);

it of course works, but I don't think the original code meets the "single total order exists" requirement.

(There is a similar issue with ldar.)
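
For anyone who wants to try this without the Console and core1 helpers, here is a minimal sketch of the same experiment in portable C++. It is not the original test program and rests on several assumptions: std::thread replaces the per-core setup, std::atomic_thread_fence(std::memory_order_seq_cst) stands in for dmb() (whose definition is not shown above), b is made a relaxed atomic so that the program has no data race, and the names Result and run_experiment are made up for the sketch. Note that the counter only records how often the reader sees b ahead of a. It cannot tell hardware reordering apart from the writer simply advancing between the reader's two loads.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Same layout as in the report, except that b is atomic (assumption:
// relaxed atomic accesses replace the plain accesses to avoid a data race).
struct Atoms {
    std::atomic<std::uint64_t> a{0};
    std::uint64_t space1[3];
    std::atomic<std::uint64_t> b{0};
    std::uint64_t space2[3];
    std::atomic<std::uint64_t> c{0};
};

// Hypothetical result type for this sketch.
struct Result {
    std::uint64_t b_ahead;   // how often the reader saw a < b
    std::uint64_t final_a;
    std::uint64_t final_b;
    std::uint64_t final_c;
};

// Hypothetical driver: the writer runs on the calling thread, the reader on
// a second thread. Pinning threads to cores is left out.
Result run_experiment(std::uint64_t iterations) {
    assert(iterations > 0);

    Atoms at;
    std::atomic<bool> running{true};
    std::uint64_t b_ahead = 0;   // written only by the reader, read after join()

    std::thread reader([&] {
        while (running.load(std::memory_order_acquire)) {
            // The fences stand in for the dmb() calls in the report.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            std::uint64_t a = at.a.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            std::uint64_t b = at.b.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            if (a < b) {
                ++b_ahead;
            }
        }
    });

    // Writer loop: seq_cst store, relaxed store, seq_cst store.
    for (std::uint64_t i = 0; i < iterations; ++i) {
        at.a.store(i, std::memory_order_seq_cst);
        at.b.store(i, std::memory_order_relaxed);
        at.c.store(i, std::memory_order_seq_cst);
    }

    running.store(false, std::memory_order_release);
    reader.join();

    return Result{b_ahead, at.a.load(), at.b.load(), at.c.load()};
}
```

Calling run_experiment(10000000) matches the iteration count used above. The b_ahead count varies from run to run and from machine to machine, so no particular value should be expected.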