Apologies, I missed attaching the patch earlier. Please find the v2 version attached.

Best Regards,
Nitin Jadhav
Azure Database for PostgreSQL
Microsoft

On Thu, Dec 4, 2025 at 12:01 PM Nitin Jadhav <[email protected]> wrote:
>
> The patch wasn’t applying cleanly on master, so I’ve rebased it and
> also added it to the PG19‑4 CommitFest:
> https://commitfest.postgresql.org/patch/6279/
> Please review and share your feedback.
>
> Best Regards,
> Nitin Jadhav
> Azure Database for PostgreSQL
> Microsoft
>
>
> On Fri, Feb 21, 2025 at 4:29 PM Nitin Jadhav
> <[email protected]> wrote:
> >
> > Hi,
> >
> > In [1], Andres reported a bug where PostgreSQL crashes during recovery
> > if the segment containing the redo pointer does not exist. I have
> > attempted to address this issue and I am sharing a patch for the same.
> >
> > The problem was that PostgreSQL did not PANIC when the redo LSN and
> > checkpoint LSN were in separate segments, and the file containing the
> > redo LSN was missing, leading to a crash. Andres has provided a
> > detailed analysis of the behavior across different settings and
> > versions. Please refer to [1] for more information. This issue arises
> > because PostgreSQL does not PANIC initially.
> >
> > The issue was resolved by ensuring that the REDO location exists once
> > we successfully read the checkpoint record in InitWalRecovery(). This
> > prevents control from reaching PerformWalRecovery() unless the WAL
> > file containing the redo record exists. A new test script,
> > 044_redo_segment_missing.pl, has been added to validate this. To
> > populate the WAL file with a redo record different from the WAL file
> > with the checkpoint record, I wait for the checkpoint start message
> > and then issue a pg_switch_wal(), which should occur before the
> > completion of the checkpoint. Then, I crash the server, and during the
> > restart, it should log an appropriate error indicating that it could
> > not find the redo location. Please let me know if there is a better
> > way to reproduce this behavior. I have tested and verified this with
> > the various scenarios Andres pointed out in [1]. Please note that this
> > patch does not address error checking in StartupXLOG(),
> > CreateCheckPoint(), etc., nor does it focus on cleaning up existing
> > code.
> >
> > Attaching the patch. Please review and share your feedback. Thanks to
> > Andres for spotting the bug and providing the detailed report [1].
> >
> > [1]:
> > https://www.postgresql.org/message-id/20231023232145.cmqe73stvivsmlhs%40awork3.anarazel.de
> >
> > Best Regards,
> > Nitin Jadhav
> > Azure Database for PostgreSQL
> > Microsoft

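
For anyone reproducing the scenario in the quoted report by hand, the condition it describes (redo LSN and checkpoint LSN in separate segments, with the redo segment missing) can be checked from outside the server. The sketch below is only an illustration and is not taken from the patch: the helper names are made up, it assumes the default 16MB WAL segment size, and it assumes the LSNs and timeline ID are copied manually from pg_controldata output ("Latest checkpoint's REDO location", "Latest checkpoint location", "Latest checkpoint's TimeLineID").

```python
import os

# Assumption: default WAL segment size (16MB). A cluster initialized with
# a different --wal-segsize needs this value adjusted.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert an LSN written as 'X/Y' (two hex numbers) into an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def segment_number(lsn, seg_size=WAL_SEGMENT_SIZE):
    """Return the WAL segment number that contains the given LSN."""
    return lsn // seg_size


def wal_file_name(tli, segno, seg_size=WAL_SEGMENT_SIZE):
    """Build the 24-hex-digit WAL file name for a timeline and segment number."""
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (tli, segno // segs_per_xlogid, segno % segs_per_xlogid)


def redo_segment_status(pg_wal_dir, tli, redo_lsn, checkpoint_lsn):
    """Hypothetical helper: report whether the redo segment differs from the
    checkpoint segment and whether the redo segment file is present."""
    redo_seg = segment_number(parse_lsn(redo_lsn))
    ckpt_seg = segment_number(parse_lsn(checkpoint_lsn))
    redo_file = wal_file_name(tli, redo_seg)
    return {
        "redo_file": redo_file,
        "separate_segments": redo_seg != ckpt_seg,
        "redo_file_exists": os.path.exists(os.path.join(pg_wal_dir, redo_file)),
    }
```

If "separate_segments" is true and "redo_file_exists" is false, the data directory is in the state the quoted report is about. This only inspects file names on disk; it says nothing about how InitWalRecovery() itself performs the check.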
v2-0001-Fix-crash-during-recovery-when-redo-segment-is-missi.patch
Description: Binary data
