On Fri, 2023-11-03 at 18:36 -0700, Andres Freund wrote: > All that is happening here is that clang can prove that nothing ever > could > change the variable
... > See https://godbolt.org/z/EaM77E8jK > > We don't gain anything from preventing the compiler from making such > conclusions afaict. Thank you. I modified this a bit to use a global variable and then perform a write outside the loop: https://godbolt.org/z/EfvGx64eE extern void DummyFunction(void); extern DummyStruct Dummy; DummyStruct Dummy = { 5 }; static DummyStruct *pDummy = &Dummy; void DummyFunction(void) { pDummy->recptr = 3; while(true) { if (pDummy->recptr == 0) break; pg_compiler_barrier(); } } And I think that mostly answers my question: if the compiler barrier is there, it loads each time; and if I take the barrier away, it optimizes into an infinite loop. I suppose what's happening without the compiler barrier is that the load is being moved up right below the store, which allows it to be optimized away entirely. The compiler barrier prevents moving the load, and therefore prevents that optimization. I'm still somewhat hazy on exactly what a compiler barrier prevents -- it seems like it depends on the internal details of how the compiler optimizes in stages. But I see that it's doing what we need it to do now. > > The key difference is that there's code changing relevant variables > :) Yeah. > I guess I don't really know what you mean with global or local > pointers? I meant "global pointers to shared memory" (like XLogCtl) vs "local pointers to shared memory" (like other_xids in TransactionIdIsActive()). > We could also implement this with a compiler barrier between fetching > pxid and > using it - but it'd potentially lead to worse code, because instead > of just > forcing one load to come from memory, it'd also force reloading > *other* > variables from memory. I see -- that's a way to be more selective about what gets reloaded since the last compiler barrier, rather than inserting a new compiler barrier which would cause everything to be reloaded. This was helpful, thank you again. I wanted to be more clear on these nuances while reviewing Bharath's patch, because I'm suggesting that he can avoid the WALBufMappingLock to reduce the risk of a regression. In the process, we'll probably get rid of that unnecessary "volatile" in AdvanceXLInsertBuffer(). Regards, Jeff Davis