Hi all,

I am starting a new thread because the ext4 issues are closed, and because pg_basebackup data durability merits a thread of its own. In short, the problem is that pg_basebackup makes no effort to ensure that the data it backs up has actually reached disk, which is bad... One possible recommendation is to run initdb -S after pg_basebackup, but making sure that the data is on disk should really be done before pg_basebackup exits.
On Thu, May 12, 2016 at 8:09 PM, I wrote:
> And actually this won't fly high if there is no equivalent of
> walkdir() or if the fsync()'s are not applied recursively. On master
> at least the refactoring had better be done cleanly first... For the
> back branches, we could just have some recursive call like
> fsync_recursively and keep that in src/bin/pg_basebackup. Andres, do
> you think that this should be part of fe_utils or src/common/? I'd
> tend to think the latter is more adapted as there is an equivalent in
> the backend. On back-branches, we could just have something like
> fsync_recursively that walks through the paths. An even more simple
> approach would be to fsync() individually things that have been
> written, but that would suck in performance.

So, attached are two patches that apply on HEAD to address the problem of pg_basebackup not syncing the data it writes.

pg_basebackup cannot simply call initdb -S because, as a client-side utility, it may be installed on a system where initdb is not (see Fedora and RHEL). I have therefore refactored the code so that the routines in initdb.c that fsync PGDATA, along with the other fsync-related routines, now live in src/fe_utils/. This is 0001.

Patch 0002 is a set of fixes for pg_basebackup:
- In plain mode, fsync_pgdata is used so that all the tablespaces are fsync'd at once. This also takes care of the case where pg_xlog is a symlink.
- In tar mode (when not writing to stdout), each tar file is synced individually, and the base directory is synced once at the end.

In both cases, failures are not considered fatal. With pg_basebackup -X and pg_receivexlog, the manipulation of WAL files is made durable by using fsync and durable_rename where needed (credits mainly to Andres for this part).

This set of patches is aimed only at HEAD. Back-patchable versions would need to copy fsync_pgdata and friends into streamutil.c, for example.

I am adding this to the next CF for review as a bug fix.

Regards,
--
Michael
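For readers who want a picture of what a recursive fsync walk looks like, here is a minimal standalone sketch. It is not the code from the attached patches: the names sync_path and sync_recursively are hypothetical, the fixed-size path buffer is a simplification, and skipping symlinks is an assumption made to keep it short (fsync_pgdata handles the pg_xlog symlink case, which this does not). As described above, failures are reported and counted rather than treated as fatal.

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a single file or directory.
 * Returns 0 on success, -1 on failure (after printing a warning).
 */
int
sync_path(const char *path, int is_dir)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
    {
        fprintf(stderr, "could not open \"%s\": %s\n", path, strerror(errno));
        return -1;
    }
    rc = fsync(fd);
    /* Some platforms cannot fsync a directory; do not count that as a failure. */
    if (rc != 0 && is_dir && (errno == EBADF || errno == EINVAL))
        rc = 0;
    if (rc != 0)
        fprintf(stderr, "could not fsync \"%s\": %s\n", path, strerror(errno));
    close(fd);
    return rc;
}

/*
 * Hypothetical helper: fsync everything below "path", children first and
 * the containing directory last. Returns the number of failures; the
 * caller decides whether those are fatal.
 */
int
sync_recursively(const char *path)
{
    struct stat st;
    int failures = 0;
    int is_dir;

    if (lstat(path, &st) != 0)
    {
        fprintf(stderr, "could not stat \"%s\": %s\n", path, strerror(errno));
        return 1;
    }
    is_dir = S_ISDIR(st.st_mode);
    if (!is_dir && !S_ISREG(st.st_mode))
        return 0;               /* assumption: symlinks and special files are skipped */

    if (is_dir)
    {
        DIR *dir = opendir(path);
        struct dirent *de;

        if (dir == NULL)
            return 1;
        while ((de = readdir(dir)) != NULL)
        {
            char child[4096];
            int len;

            if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
                continue;
            len = snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
            if (len < 0 || (size_t) len >= sizeof(child))
            {
                failures++;     /* path too long for this sketch's buffer */
                continue;
            }
            failures += sync_recursively(child);
        }
        closedir(dir);
    }
    if (sync_path(path, is_dir) != 0)
        failures++;
    return failures;
}
```

A caller in plain mode would invoke sync_recursively() once on the target directory after the last file has been written and emit a warning if the return value is non-zero. Syncing the directory itself after its children matters because newly created directory entries are only durable once the parent directory has been flushed.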
0001-Relocation-fsync-routines-of-initdb-into-fe_utils.patch
Description: application/download
0002-Issue-fsync-more-carefully-in-pg_receivexlog-and-pg_.patch
Description: application/download
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers