Hi all,

While investigating a bug in the patch to get rid of WALBufMappingLock [1], I found that surrounding pg_atomic_compare_exchange_u64() with explicit memory barriers fixes the problem for me. That doesn't feel right because, according to the comments, both pg_atomic_compare_exchange_u32() and pg_atomic_compare_exchange_u64() should provide full memory barrier semantics. So, I decided to investigate this further.
In my case, the implementation of pg_atomic_compare_exchange_u64() is based on __atomic_compare_exchange_n().

static inline bool
pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
                                    uint64 *expected, uint64 newval)
{
    AssertPointerAlignment(expected, 8);
    return __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

According to the docs, __ATOMIC_SEQ_CST enforces total ordering with *all other __ATOMIC_SEQ_CST operations*. It only speaks of other __ATOMIC_SEQ_CST operations; nothing is said about regular reads and writes. That is quite far from our comment promising full barrier semantics, which, in my understanding, means the equivalent of both pg_read_barrier() and pg_write_barrier(), i.e. ordering against *all other reads/writes*.

I've found that __atomic_compare_exchange_n() ends up calling __aarch64_cas8_acq_rel, which disassembles as follows.

0000000000000860 <__aarch64_cas8_acq_rel>:
 860:   d503245f    bti     c
 864:   90000110    adrp    x16, 20000 <__data_start>
 868:   3940a210    ldrb    w16, [x16, #40]
 86c:   34000070    cbz     w16, 878 <__aarch64_cas8_acq_rel+0x18>
 870:   c8e0fc41    casal   x0, x1, [x2]
 874:   d65f03c0    ret
 878:   aa0003f0    mov     x16, x0
 87c:   c85ffc40    ldaxr   x0, [x2]
 880:   eb10001f    cmp     x0, x16
 884:   54000061    b.ne    890 <__aarch64_cas8_acq_rel+0x30>  // b.any
 888:   c811fc41    stlxr   w17, x1, [x2]
 88c:   35ffff91    cbnz    w17, 87c <__aarch64_cas8_acq_rel+0x1c>
 890:   d65f03c0    ret

This function first checks whether LSE instructions are available. If so, the LSE casal instruction is used. If not (as in my case), a loop of ldaxr/stlxr is used instead. The documentation [2] says both ldaxr and stlxr provide one-way barriers: reads/writes after ldaxr will be observed after it, and reads/writes before stlxr will be observed before it. So, the pair of these instructions forms a full memory barrier. However, if we don't observe the expected value, only ldaxr gets executed. So, an unsuccessful pg_atomic_compare_exchange_u64() attempt leaves us with only a one-way barrier, and that caused the bug in my case.

In contrast, __sync_val_compare_and_swap() uses __aarch64_cas8_sync, which is essentially the same as __aarch64_cas8_acq_rel but has an explicit memory barrier at the end.

I have a couple of ideas on how to fix this problem.

1. Provide full barrier semantics for pg_atomic_compare_exchange_*(). In particular, for __atomic_compare_exchange_n(), we should probably use __ATOMIC_ACQ_REL for success and run an explicit memory barrier in the case of failure (see the sketch in the P.S. below).
2. Document that pg_atomic_compare_exchange_*() doesn't necessarily provide a memory barrier on failure. But then we need to carefully check whether existing use cases now need explicit memory barriers.

Any thoughts?

Links:
1. https://www.postgresql.org/message-id/CAPpHfdsWcQb-u-9K%3DipneBf8CMhoUuBWKYc%2BXWJEHVdtONOepQ%40mail.gmail.com
2. https://developer.arm.com/documentation/100941/0101/Barriers

------
Regards,
Alexander Korotkov
Supabase
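
P.S. To make option 1 a bit more concrete, here is a rough sketch of what the __atomic_compare_exchange_n()-based implementation could look like. This is only an illustration of the idea, not a tested patch; in particular, the choice of __ATOMIC_ACQUIRE as the failure ordering and __atomic_thread_fence() as the fallback barrier are my assumptions about how this could be wired up.

static inline bool
pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
                                    uint64 *expected, uint64 newval)
{
    bool        ret;

    AssertPointerAlignment(expected, 8);

    /* acquire/release ordering for the successful exchange */
    ret = __atomic_compare_exchange_n(&ptr->value, expected, newval, false,
                                      __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);

    /*
     * On failure only the acquire half is guaranteed (e.g. a lone ldaxr on
     * AArch64), so issue an explicit full fence to keep the documented
     * full-barrier promise on both paths.
     */
    if (!ret)
        __atomic_thread_fence(__ATOMIC_SEQ_CST);

    return ret;
}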