On Fri, Apr 28, 2023 at 2:24 PM Drouvot, Bertrand <bertranddrouvot...@gmail.com> wrote:
>
> > Can you
> > please explain the logic behind this test a bit more like how the WAL
> > file switch helps you to achieve the purpose?
> >
>
> The idea was to generate enough "wal switch" on the primary to ensure
> the WAL file has been removed.
>
> I gave another thought on it and I think we can skip the test that the WAL is
> not on the primary any more. That way, one "wal switch" seems to be enough
> to see it removed on the standby.
>
> It's done in V7.
>
> V7 is not doing "extra tests" than necessary and I think it's probably better
> like this.
>
> I can see V7 failing on "Cirrus CI / macOS - Ventura - Meson" only (other
> machines are not complaining).
>
> It does fail on "invalidated logical slots do not lead to retaining WAL", see
> https://cirrus-ci.com/task/4518083541336064
>
> I'm not sure why it is failing, any idea?
>

I think the reason for the failure is that on standby, the test is not
able to remove the file corresponding to the invalid slot. You are
using pg_switch_wal() to generate a switch record and I think you need
one more WAL-generating statement after that to achieve your purpose,
which is that during checkpoint, the test removes the WAL file
corresponding to an invalid slot. Just doing a checkpoint on the primary
may not serve the need as that doesn't lead to any new insertion of WAL
on the standby.

Is your v6 failing in the same environment? If not, then it is probably
because the test does an insert after pg_switch_wal() in that version.
Why did you change the order of the insert in v7?

BTW, you can confirm the failure by changing the DEBUG2 message in
RemoveOldXlogFiles() to LOG. In the case where the test fails, it may
not remove the WAL file corresponding to an invalid slot, whereas it
will remove the WAL file when the test succeeds.

-- 
With Regards,
Amit Kapila.
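
For illustration only, the statement ordering suggested above could be sketched as below. This is a minimal sketch, not the actual TAP test (which is written in Perl): the table name `wal_filler`, the helper `wal_removal_steps`, and the final checkpoint on the standby are assumptions made for the example. The point it shows is that a WAL-generating statement follows pg_switch_wal() before any checkpoint is requested.

```python
from typing import List, Tuple

# (node name, SQL statement); the node names are illustrative labels only.
Step = Tuple[str, str]


def wal_removal_steps(table: str = "wal_filler") -> List[Step]:
    """Return the statements in the order discussed in the mail.

    `table` is a hypothetical table used only to generate some WAL
    after the switch record.
    """
    return [
        # Emit a switch record so the current WAL segment is closed.
        ("primary", "SELECT pg_switch_wal()"),
        # One more WAL-generating statement after the switch, so that
        # new WAL is inserted and reaches the standby.
        ("primary", f"INSERT INTO {table} VALUES (1)"),
        # Checkpoint on the primary, as the test already does.
        ("primary", "CHECKPOINT"),
        # Assumption: a checkpoint is also requested on the standby,
        # since that is where the segment should be removed.
        ("standby", "CHECKPOINT"),
    ]


if __name__ == "__main__":
    for node, sql in wal_removal_steps():
        print(f"{node}: {sql};")
```

A real test would also need to wait for the standby to replay the new WAL before the last step; that wait is omitted here.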