On 12/06/2025 4:13 pm, Andres Freund wrote:
Hi,
On 2025-06-12 15:12:00 +0300, Konstantin Knizhnik wrote:
Reproduced it once again, this time with a write-protected io handle.
But once again there is no access violation, just an assertion failure.
Previously the "op" field was overwritten somewhere between `pgaio_io_reclaim`
and `AsyncReadBuffers`:
!!!pgaio_io_reclaim [20376]| ioh: 0x1019bc000, ioh->op: 0, ioh->generation: 19346
!!!AsyncReadBuffers [20376] (1)| blocknum: 21, ioh: 0x1019bc000, ioh->op: 1, ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 0, ioh->generation: 19346
Now it is overwritten after the print in `AsyncReadBuffers`:
!!!pgaio_io_reclaim [88932]| ioh: 0x105a5c000, ioh->op: 0, ioh->generation: 42848
!!!pgaio_io_acquire_nb[88932]| ioh: 0x105a5c000, ioh->op: 0, ioh->generation: 42848
!!!AsyncReadBuffers [88932] (1)| blocknum: 10, ioh: 0x105a5c000, ioh->op: 0, ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 0, ioh->generation: 42848
!!!pgaio_io_before_start| ioh: 0x105a5c000, ioh->op: 1, ioh->state: 1, ioh->result: 0, ioh->num_callbacks: 2, ioh->generation: 42848
In this run I prohibit writes to the io handle in `pgaio_io_acquire_nb` and
re-enable them in `AsyncReadBuffers`.
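
For readers who want to try the same kind of experiment, here is a minimal
sketch of write-protecting a handle with `mmap`/`mprotect`. It is not the
instrumentation used in this run: `DemoHandle` and the `demo_*` functions are
hypothetical names, and the sketch assumes each handle sits alone on its own
page. Keep in mind that `mprotect` only changes the mapping of the calling
process, so a store from another process that maps the same shared memory
would not fault.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for an I/O handle; not PostgreSQL's PgAioHandle. */
typedef struct DemoHandle
{
    int op;
    int state;
} DemoHandle;

/* Size of the mapping backing one handle: exactly one page. */
static size_t
demo_page_size(void)
{
    return (size_t) sysconf(_SC_PAGESIZE);
}

/* Allocate a handle on its own page so mprotect() affects nothing else. */
static DemoHandle *
demo_handle_alloc(void)
{
    void *p = mmap(NULL, demo_page_size(), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return (p == MAP_FAILED) ? NULL : (DemoHandle *) p;
}

/* Forbid writes: a later store from this process faults instead of landing. */
static int
demo_handle_protect(DemoHandle *h)
{
    return mprotect(h, demo_page_size(), PROT_READ);
}

/* Allow writes again, e.g. right before the code that legitimately sets op. */
static int
demo_handle_unprotect(DemoHandle *h)
{
    return mprotect(h, demo_page_size(), PROT_READ | PROT_WRITE);
}
```

With this in place, a stray store made by the same process between
`demo_handle_protect` and `demo_handle_unprotect` shows up as a fault at the
offending instruction rather than as a later assertion failure.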
I'm reasonably certain I found the issue, I think it's a missing memory
barrier on the read side. The CPU is reordering the read (or just using a
cached value) of ->distilled_result to be before the load of ->state.
But it'll take a bit to verify that that's the issue...
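
To make the suspected ordering problem concrete, here is a minimal sketch of
the general pattern, written with C11 atomics rather than PostgreSQL's own
barrier primitives (`pg_read_barrier()` / `pg_write_barrier()`). The names
`DemoIo`, `demo_io_complete` and `demo_io_try_get_result` are hypothetical, and
this is not the actual fix, only an illustration of why the reader needs
ordering between the load of the state and the load of the result.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum
{
    DEMO_IO_PENDING = 0,
    DEMO_IO_COMPLETED = 1
};

/* Hypothetical handle: 'result' is a plain field published through 'state'. */
typedef struct DemoIo
{
    atomic_int state;
    int        result;
} DemoIo;

static void
demo_io_init(DemoIo *io)
{
    io->result = 0;
    atomic_init(&io->state, DEMO_IO_PENDING);
}

/* Writer side: store the result first, then publish it with a release store. */
static void
demo_io_complete(DemoIo *io, int result)
{
    io->result = result;
    atomic_store_explicit(&io->state, DEMO_IO_COMPLETED, memory_order_release);
}

/*
 * Reader side: the acquire load keeps the read of 'result' from being
 * performed before the load of 'state'. With a relaxed (or plain) load the
 * CPU or compiler could read 'result' early and return a stale value even
 * though 'state' already says COMPLETED.
 */
static bool
demo_io_try_get_result(DemoIo *io, int *out)
{
    if (atomic_load_explicit(&io->state, memory_order_acquire) != DEMO_IO_COMPLETED)
        return false;

    *out = io->result;
    return true;
}
```

The release store and the acquire load form a pair: if the reader observes
`DEMO_IO_COMPLETED`, it is also guaranteed to observe the `result` written
before it. Dropping either half reintroduces the reordering described above.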
It is great!
But I wonder how it correlates with your previous statement:
There shouldn't be any concurrent accesses here, so I don't really see how the
above would explain the problem (the IO can only ever be modified by one
backend, initially the "owning backend", then, when submitted, by the IO
worker, and then again by the backend).
This is what I am observing myself: the "op" field is modified and fetched
by the same process.
Certainly the process can be rescheduled to some other CPU. But if such a
reschedule can cause the loss of a stored value, then nothing will work, will it?
So assume that there is some variable "x" which is updated by the process:
"x=1" is executed on CPU1, then the process is rescheduled to CPU2, which does
"x=2", then the process is once again rescheduled to CPU1 and we find
that "x==1". And to prevent this we would need to explicitly enforce some
memory barrier. Unless I am missing something, nothing can work with such
a memory model.