On Mon, May 26, 2025 at 12:05 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.mu...@gmail.com> writes:
> > Could you guys please share your exact repro steps?
>
> I've just been running 027_stream_regress.pl over and over.
> It's not a recommendable answer though because the failure
> probability is tiny, under 1%.  It sounded like Alexander
> had a better way.

Could you please share your configure options?

While flailing around in the dark and contemplating sources of
nondeterminism that might come from a small system under a lot of load
(as hinted at by Alexander's mention of running the test in parallel)
with a 1MB buffer pool (as used by 027_stream_regress.pl via Cluster.pm's
settings for replication tests), I thought about partial reads:

--- a/src/backend/storage/aio/aio_io.c
+++ b/src/backend/storage/aio/aio_io.c
@@ -128,6 +128,8 @@ pgaio_io_perform_synchronously(PgAioHandle *ioh)
             result = pg_preadv(ioh->op_data.read.fd, iov,
                                ioh->op_data.read.iov_length,
                                ioh->op_data.read.offset);
+            if (result > BLCKSZ && rand() < RAND_MAX / 2)
+                result = BLCKSZ;

... and the fallback path for io_method=worker that runs IOs
synchronously when the submission queue overflows because the I/O
workers aren't keeping up:

--- a/src/backend/storage/aio/method_worker.c
+++ b/src/backend/storage/aio/method_worker.c
@@ -253,7 +253,7 @@ pgaio_worker_submit_internal(int nios, PgAioHandle *ios[])
     for (int i = 0; i < nios; ++i)
     {
         Assert(!pgaio_worker_needs_synchronous_execution(ios[i]));
-        if (!pgaio_worker_submission_queue_insert(ios[i]))
+        if (rand() < RAND_MAX / 2 || !pgaio_worker_submission_queue_insert(ios[i]))
         {
             /*
              * We'll do it synchronously, but only after we've sent as many as

... but still no dice here...
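
For anyone who wants to play with the partial-read idea outside the
tree, here is a rough standalone sketch of the same kind of fault
injection. It is not PostgreSQL code: BLOCK_SIZE stands in for BLCKSZ,
and MemFile, flaky_pread() and read_fully() are hypothetical names made
up for this illustration. The reader randomly truncates any multi-block
read to a single block, and the caller has to loop to get the rest.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define BLOCK_SIZE 8192			/* stand-in for BLCKSZ (assumption) */

/* Hypothetical in-memory "file", so the sketch needs no real I/O. */
typedef struct MemFile
{
	const char *data;
	size_t		size;
} MemFile;

/*
 * Read up to len bytes at offset.  Roughly half of the reads that would
 * return more than one block are cut down to exactly one block, which
 * simulates a short read.
 */
static ssize_t
flaky_pread(const MemFile *f, char *buf, size_t len, size_t offset)
{
	size_t		avail;
	size_t		n;

	if (offset >= f->size)
		return 0;				/* EOF */
	avail = f->size - offset;
	n = len < avail ? len : avail;

	if (n > BLOCK_SIZE && rand() < RAND_MAX / 2)
		n = BLOCK_SIZE;			/* injected partial read */

	memcpy(buf, f->data + offset, n);
	return (ssize_t) n;
}

/*
 * A caller that tolerates short reads: keep reading the remainder until
 * len bytes have arrived or EOF is reached.  Returns the bytes read.
 */
static ssize_t
read_fully(const MemFile *f, char *buf, size_t len, size_t offset)
{
	size_t		done = 0;

	while (done < len)
	{
		ssize_t		r = flaky_pread(f, buf + done, len - done, offset + done);

		if (r < 0)
			return -1;			/* error */
		if (r == 0)
			break;				/* EOF */
		done += (size_t) r;
	}
	return (ssize_t) done;
}
```

Any caller that assumes one call returns everything it asked for will
see truncated data about half the time with this in place, which is the
sort of bug the injected short reads are meant to flush out.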