On 13/06/2025 11:20 pm, Andres Freund wrote:
Attached is a patch that fixes the problem for me. Alexander, Konstantin,
could you verify that it also fixes the problem for you?

Given that it does address the problem for me, I'm inclined to push this
fairly soon, the barrier is pretty obviously required.

Unfortunately I still able to reproduce assertion failure after 17450 seconds (`--enable-debug --enable-cassert CFLAGS=-O0`):


  * frame #0: 0x0000000187248704 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000018727fc28 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000018718dae8 libsystem_c.dylib`abort + 180
    frame #3: 0x00000001011a0fbc postgres`ExceptionalCondition(conditionName="ioh->op == PGAIO_OP_INVALID", fileName="aio_io.c", lineNumber=161) at assert.c:66:2     frame #4: 0x0000000100ea9b40 postgres`pgaio_io_before_start(ioh=0x000000010a6e22f0) at aio_io.c:161:2     frame #5: 0x0000000100ea9a00 postgres`pgaio_io_start_readv(ioh=0x000000010a6e22f0, fd=12, iovcnt=1, offset=85565440) at aio_io.c:81:2     frame #6: 0x0000000100ec8798 postgres`FileStartReadV(ioh=0x000000010a6e22f0, file=4, iovcnt=1, offset=85565440, wait_event_info=167772181) at fd.c:2241:2     frame #7: 0x0000000100f1b7d0 postgres`mdstartreadv(ioh=0x000000010a6e22f0, reln=0x000000010b0289c8, forknum=MAIN_FORKNUM, blocknum=10445, buffers=0x000000016f5ed998, nblocks=1) at md.c:1019:8     frame #8: 0x0000000100f1ed88 postgres`smgrstartreadv(ioh=0x000000010a6e22f0, reln=0x000000010b0289c8, forknum=MAIN_FORKNUM, blocknum=10445, buffers=0x000000016f5ed998, nblocks=1) at smgr.c:758:2     frame #9: 0x0000000100eb1a48 postgres`AsyncReadBuffers(operation=0x000000011d80d108, nblocks_progress=0x000000016f5edeb4) at bufmgr.c:1959:3     frame #10: 0x0000000100eb0c24 postgres`StartReadBuffersImpl(operation=0x000000011d80d108, buffers=0x000000011d80cdf4, blockNum=10445, nblocks=0x000000016f5edeb4, flags=0, allow_forwarding=true) at bufmgr.c:1428:18

I wonder if we also need second write barrier in pgaio_io_update_state:

    /*
     * Ensure the changes signified by the new state are visible before the
     * new state becomes visible.
     */
    pg_write_barrier();

    ioh->state = new_state;

    /*
     * Make sure that status is written
     */
    pg_write_barrier();


Otherwise is is not clear when status is written.
Also I think that replacing bitfields with `uint8` and may be even with `int`, is good idea at least to avoids false sharing.

Reply via email to