On Mon, Jan 23, 2023 at 7:04 AM Ian Lance Taylor <i...@golang.org> wrote:
> Memory ordering only makes sense in terms of two different execution
> threads using shared memory. In order to answer your question
> precisely, you need to tell us what the process reading the memory
> region is going to do to access the memory. In order to know how to
> write the memory, it's necessary to know how the memory is going to be
> read.

That's a fair point. I avoided going into details so as not to risk tickling the latent design urges of the readers ;)

Setup:
- Single-writer, multiple-readers scenario.
- The writer is always exclusively single-threaded, no concurrency whatsoever. The only possible sources of operation reordering are:
  a) the discrete CPU execution pipeline
  b) the compiler itself
  c) OS preemption / SMP migration
- Communication is over a single massive mmaped file-backed region.
- Exploits the fact that on Linux the VFS cache in front of the named file and the mmaped "window" within every process are all literally the same kernel memory.
- Communication is strictly one-way: the writer neither knows nor cares about the number of readers, what they are looking at, etc.
- Readers are expected to accommodate the above, be prepared to look at stale data, etc.
- For simplicity assume that the file/mmap is of unreachable size (say 1PiB) and that all additions are appends, with no garbage collection: stale data which is not referenced by anything just sticks around indefinitely.

Writer pseudocode (always only one thread, *has exclusive write access*)
1. Read current positioning from mmap offset 0. No locks needed, since I am the one who modified things last.
2. Do the payload writes: an append of several GiB within the unused portion of the mmap.
3. Write out the necessary indexes and pointers to the contents of 2: another append, this time several KiB.
4. {{ MY QUESTION }} Emit an SFENCE <https://c9x.me/x86/html/file_module_x86_id_289.html> / LOCK <https://c9x.me/x86/html/file_module_x86_id_159.html> (amd64) or DMB <https://developer.arm.com/documentation/dui0489/c/arm-and-thumb-instructions/miscellaneous-instructions/dmb--dsb--and-isb> (arm64) to ensure sequencing consistency and that all CPUs see the same state of the kernel memory backing the mmap.
5. Write a single uint64 at mmap offset 0, pointing to the new "state of the world" written during 3, which in turn points at the various pieces of data written in 2.
6. goto 1

Readers pseudocode (many readers, various implementation languages not just Go, utterly uncoordinated, happy to see an "old transaction", but *expect 5 => 3 => 2 to be always consistent*)
1. Read current positioning from mmap offset 0. No locks, as I am equally happy to see the new or the old uint64. I do assume that a word-sized read is always atomic, and that I won't see a "torn" u64.
2. Walk around either the new or the old network of pointers. The barrier in step 4 of the writer ensures I can't see a pointer to something that doesn't yet exist.

The end.

> There is no Go equivalent to a pure write memory barrier.

Ian, I recognize I am speaking to one of the language creators and that you know *way* more than me about this subject. Nevertheless I find it really hard to accept your statement. There has got to be a set of constructs that have the desired side effects described in step 4 above. I also still maintain that the memory model should discuss this, in the compilation guarantees section at the bottom. After all, a standalone Go program is nothing more than a list of instructions for a CPU, mediated by an OS. The precise sequencing of these instructions in special circumstances should be clear and controllable.

I guess I will spend some time learning how to poke around the generated assembly tomorrow...
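For concreteness, here is a minimal sketch of the writer and reader steps above. It is not a confirmed answer to the step 4 question: it assumes that an atomic store from sync/atomic can stand in for "barrier plus step 5", and that an atomic load on the reader side pairs with it. The Go memory model only describes goroutines within one program, so whether this carries over to separate processes sharing an mmap is an assumption about the instructions these calls compile to. The record layout and the names newRegion, publish and readLatest are hypothetical, and the region is an ordinary heap buffer rather than a real mmaped file so that the example runs anywhere.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync/atomic"
	"unsafe"
)

const (
	headerSize = 8  // root uint64 at offset 0 (native byte order)
	indexSize  = 16 // hypothetical index record: payload offset + payload length
)

// newRegion returns an 8-byte-aligned byte slice standing in for the mmaped
// file. Hypothetical helper: a real writer would obtain the slice from mmap.
func newRegion(size int) []byte {
	words := make([]uint64, (size+7)/8)
	return unsafe.Slice((*byte)(unsafe.Pointer(&words[0])), len(words)*8)
}

// root returns a pointer to the uint64 stored at offset 0 of the region.
func root(region []byte) *uint64 {
	return (*uint64)(unsafe.Pointer(&region[0]))
}

// publish appends payload and an index record, then stores the offset of the
// index record at offset 0 (writer steps 1 to 5). Single writer only.
func publish(region, payload []byte) (uint64, error) {
	r := root(region)

	// Step 1: find the first unused byte. Root 0 means "nothing published".
	next := uint64(headerSize)
	if cur := atomic.LoadUint64(r); cur != 0 {
		next = cur + indexSize
	}
	payloadOff := next
	indexOff := payloadOff + uint64(len(payload))
	if indexOff+indexSize > uint64(len(region)) {
		return 0, fmt.Errorf("region full: need %d bytes, have %d",
			indexOff+indexSize, len(region))
	}

	// Step 2: plain (non-atomic) payload writes.
	copy(region[payloadOff:indexOff], payload)

	// Step 3: plain writes of the index record pointing at the payload.
	binary.LittleEndian.PutUint64(region[indexOff:], payloadOff)
	binary.LittleEndian.PutUint64(region[indexOff+8:], uint64(len(payload)))

	// Steps 4 and 5 (assumption): the atomic store publishes the new root
	// and orders it after the plain writes above.
	atomic.StoreUint64(r, indexOff)
	return indexOff, nil
}

// readLatest follows the root to the newest index record visible to it and
// returns the payload that record points at (reader steps 1 and 2).
func readLatest(region []byte) ([]byte, bool) {
	indexOff := atomic.LoadUint64(root(region))
	if indexOff == 0 {
		return nil, false
	}
	payloadOff := binary.LittleEndian.Uint64(region[indexOff:])
	n := binary.LittleEndian.Uint64(region[indexOff+8:])
	return region[payloadOff : payloadOff+n], true
}

func main() {
	region := newRegion(1 << 12)
	for _, msg := range []string{"first transaction", "second transaction"} {
		if _, err := publish(region, []byte(msg)); err != nil {
			panic(err)
		}
	}
	if data, ok := readLatest(region); ok {
		fmt.Printf("latest payload: %q\n", data)
	}
}
```

Readers written in other languages would need their own equivalent of the atomic load at offset 0; the sketch says nothing about how they should do that.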