On Thu, Aug 4, 2022 at 11:59 AM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote: > > On Sun, Jul 31, 2022 at 8:36 PM mahendrakar s > <mahendrakarfo...@gmail.com> wrote: > > > >> On Mon, 25 Jul 2022 at 16:42, Bharath Rupireddy > >> <bharath.rupireddyforpostg...@gmail.com> wrote: > >> Here's the v3 patch after rebasing. > >> > >> I just would like to reiterate the issue the patch is trying to solve: > >> At times (when no space is left on the device or when the process is > >> taken down forcibly (VM/container crash)), there can be leftover > >> uninitialized .partial files (note that padding i.e. filling 16MB WAL > >> files with all zeros is done in non-compression cases) due to which > >> pg_receivewal fails to come up after the crash. To address this, the > >> proposed patch makes the padding 16MB WAL files atomic (first write a > >> .temp file, pad it and durably rename it to the original .partial > >> file, ready to be filled with received WAL records). This approach is > >> similar to what core postgres achieves atomicity while creating new > >> WAL file (see how XLogFileInitInternal() creates xlogtemp.%d file > >> first and then how InstallXLogFileSegment() durably renames it to > >> original WAL file). > >> > >> Thoughts? > > > > Hi Bharath, > > > > Idea to atomically allocate WAL file by creating tmp file and renaming it > > is nice. > > Thanks for reviewing it. > > > I have one question though: > > How is partially temp file created will be cleaned if the VM crashes or out > > of disk space cases? Does it endup creating multiple files for every VM > > crash/disk space during process of pg_receivewal? > > > > Thoughts? > > It is handled in the patch, see [1]. > > Attaching v4 patch herewith which now uses the temporary file suffix > '.tmp' as opposed to v3 patch '.temp'. This is just to be in sync with > other atomic file write codes in the core - autoprewarm, > pg_stat_statement, slot, basebacup, replorigin, snapbuild, receivelog > and so on. > > Please review the v4 patch.
I've done some more testing today (hacked the code a bit by adding pg_usleep(10000L); in pre-padding loop and crashing the pg_receivewal process to produce the warning [1]) and found that the better place to remove ".partial.tmp" leftover files is in FindStreamingStart() because there we do a traversal of all the files in target directory along the way to remove if ".partial.tmp" file(s) is/are found. Please review the v5 patch further. [1] pg_receivewal: warning: segment file "0000000100000006000000B9.partial" has incorrect size 15884288, skipping -- Bharath Rupireddy RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
v5-0001-Avoid-partially-padded-files-under-pg_receivewal-.patch
Description: Binary data