On Wed, Apr 27, 2022 at 1:47 PM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
> > >
> > > I've now done several runs with your patch and not seen the test failure.
> > > However, I think we ought to rethink this API a bit rather than just
> > > apply the patch as-is. Even if it were documented, relying on
> > > errormsg = NULL to mean something doesn't seem like a great plan.
> >
> > Sorry for being late in the game, occupied with other stuff.
> >
> > How about using private_data of XLogReaderState for
> > read_local_xlog_page_no_wait, something like this?
> >
> > typedef struct ReadLocalXLogPageNoWaitPrivate
> > {
> >     bool end_of_wal;
> > } ReadLocalXLogPageNoWaitPrivate;
> >
> > In read_local_xlog_page_no_wait:
> >
> >     /* If asked, let's not wait for future WAL. */
> >     if (!wait_for_wal)
> >     {
> >         private_data->end_of_wal = true;
> >         break;
> >     }
> >
> >     /*
> >      * Opaque data for callbacks to use. Not used by XLogReader.
> >      */
> >     void *private_data;
>
> I found an easy way to reproduce this consistently (I think on any server):
>
> I basically generated huge WAL record (I used a fun extension that I
> wrote - https://github.com/BRupireddy/pg_synthesize_wal, but one can
> use pg_logical_emit_message as well)
> session 1:
> select * from pg_synthesize_wal_record(1*1024*1024); --> generate 1 MB
> of WAL record first and make a note of the output lsn (lsn1)
>
> session 2:
> select * from pg_get_wal_records_info_till_end_of_wal(lsn1);
> \watch 1
>
> session 1:
> select * from pg_synthesize_wal_record(1000*1024*1024); --> generate
> ~1 GB of WAL record and we see ERROR: could not read WAL at XXXXX in
> session 2.
>
> Delay the checkpoint (set checkpoint_timeout to 1hr) just not recycle
> the wal files while we run pg_walinspect functions, no other changes
> required from the default initdb settings on the server.
>
> And, Thomas's patch fixes the issue.
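
To make the private_data idea quoted above concrete, here is a minimal standalone sketch. It is not the attached patch and does not use the real XLogReader API: MockReaderState, mock_read_page_no_wait and mock_scan are hypothetical stand-ins, and WAL positions are modelled as plain integers. Only the struct name ReadLocalXLogPageNoWaitPrivate and its end_of_wal field come from the proposal. The point is that the caller can tell "ran out of flushed WAL" apart from a genuine read failure without giving errormsg = NULL a special meaning.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for XLogReaderState; only the opaque pointer matters. */
typedef struct MockReaderState
{
    void *private_data;     /* opaque data for callbacks, unused by the reader */
} MockReaderState;

/* Struct name and field as proposed in the quoted mail. */
typedef struct ReadLocalXLogPageNoWaitPrivate
{
    bool end_of_wal;
} ReadLocalXLogPageNoWaitPrivate;

/*
 * Hypothetical page-read callback. Returns 1 on success and -1 on failure.
 * When the target is at or past the flushed position it does not wait;
 * it sets end_of_wal so the caller knows this is not a real error.
 * corrupt_at simulates a position that genuinely cannot be read.
 */
static int
mock_read_page_no_wait(MockReaderState *state, long target,
                       long flushed_upto, long corrupt_at)
{
    ReadLocalXLogPageNoWaitPrivate *private_data = state->private_data;

    if (target >= flushed_upto)
    {
        private_data->end_of_wal = true;
        return -1;
    }
    if (target == corrupt_at)
        return -1;          /* real failure: flag stays false */
    return 1;
}

/*
 * Hypothetical caller. Returns the number of positions read before reaching
 * the end of flushed WAL, or -1 if a real read error occurred.
 */
static long
mock_scan(long start, long flushed_upto, long corrupt_at)
{
    ReadLocalXLogPageNoWaitPrivate priv = { .end_of_wal = false };
    MockReaderState state = { .private_data = &priv };
    long nread = 0;

    for (long pos = start;; pos++)
    {
        if (mock_read_page_no_wait(&state, pos, flushed_upto, corrupt_at) < 0)
            return priv.end_of_wal ? nread : -1;
        nread++;
    }
}
```

In this sketch, a scan that simply reaches the flushed position stops quietly and reports how much it read, while a failure before that point is still reported as an error.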

Here's the v2 patch (on top of Thomas's v1 at [1]), which uses
private_data to set the end-of-WAL flag. Please have a look at it.

[1] https://www.postgresql.org/message-id/CA%2BhUKGLtswFk9ZO3WMOqnDkGs6dK5kCdQK9gxJm0N8gip5cpiA%40mail.gmail.com

Regards,
Bharath Rupireddy.

Attachment: v2-0001-Fix-pg_walinspect-race-against-flush-LSN.patch