Re: [HACKERS] Allowing multiple concurrent base backups

Fujii Masao Mon, 31 Jan 2011 19:45:56 -0800

On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
<[email protected]> wrote:
> Hmm, good point. It's harmless, but creating the history file in the first
> place sure seems like a waste of time.


The attached patch changes pg_stop_backup so that it doesn't create
the backup history file if archiving is not enabled.

When I tested multiple concurrent backups, I found that they can end up with
the same checkpoint location and the same history file name.

--------------------
$ for ((i=0; i<4; i++)); do
pg_basebackup -D test$i -c fast -x -l test$i &
done

$ cat test0/backup_label
START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/20000E8
START TIME: 2011-02-01 12:12:31 JST
LABEL: test0

$ cat test1/backup_label
START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/20000E8
START TIME: 2011-02-01 12:12:31 JST
LABEL: test1

$ cat test2/backup_label
START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/20000E8
START TIME: 2011-02-01 12:12:31 JST
LABEL: test2

$ cat test3/backup_label
START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/20000E8
START TIME: 2011-02-01 12:12:31 JST
LABEL: test3

$ ls archive/*.backup
archive/000000010000000000000002.000000B0.backup
--------------------
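
The single history file is what you would expect if the file name is built
only from the timeline and the backup's start WAL location, which is what the
output above suggests. Here is a minimal sketch of that derivation. The helper
name history_file_name and the 16 MB segment size are assumptions made for
illustration; this is not the server's actual code.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: 16 MB WAL segments, consistent with the transcript above. */
#define WAL_SEG_SIZE UINT32_C(0x1000000)

/*
 * Hypothetical helper: build the backup history file name for a backup
 * that started at WAL location xlogid/xrecoff on timeline tli.
 * Note that the backup label is not an input, so it cannot make the
 * name unique.
 */
static void
history_file_name(char *buf, size_t len, uint32_t tli,
                  uint32_t xlogid, uint32_t xrecoff)
{
    uint32_t seg = xrecoff / WAL_SEG_SIZE;  /* segment number within the log */
    uint32_t off = xrecoff % WAL_SEG_SIZE;  /* byte offset inside the segment */

    snprintf(buf, len, "%08X%08X%08X.%08X.backup",
             (unsigned) tli, (unsigned) xlogid,
             (unsigned) seg, (unsigned) off);
}
```

Calling it with timeline 1 and location 0/20000B0 gives the name listed in
the archive directory above. Since all four backups report that same start
location, they all map to that one file.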

This would cause a serious problem, because the backup-end record, which
identifies its backup only by the "START WAL LOCATION", can be written by the
first backup before the others finish. Recovery from one of the later backups
might then wrongly conclude that it has already reached a consistent state on
reading that backup-end record (written by the first backup), before it has
read the last WAL file it requires.

                /*
                 * Force a CHECKPOINT.  Aside from being necessary to prevent torn
                 * page problems, this guarantees that two successive backup runs will
                 * have different checkpoint positions and hence different history
                 * file names, even if nothing happened in between.
                 *
                 * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
                 * fast = true).  Otherwise this can take awhile.
                 */
                RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
                                                  (fast ? CHECKPOINT_IMMEDIATE : 0));

This problem happens because the above code (in do_pg_start_backup) does not
actually ensure that concurrent backups have different checkpoint locations.
ISTM that we should change the above code, or something elsewhere, to ensure
that. Alternatively, we could include the backup label name in the backup-end
record, so that recovery never acts on a backup-end record that is not its own.
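
To make the second option concrete, here is a rough sketch of the two matching
rules. The struct and both function names are hypothetical and heavily
simplified; they do not reflect the real WAL record layout. The sketch also
assumes that concurrent backups use distinct labels, as in the test above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified backup-end record; not the real WAL format. */
typedef struct BackupEndRecord
{
    uint64_t    start_loc;   /* START WAL LOCATION of the finished backup */
    char        label[32];   /* proposed addition: the backup label */
} BackupEndRecord;

/* Matching on the start location alone, as described above. */
static bool
ends_backup_by_location(const BackupEndRecord *rec, uint64_t my_start)
{
    return rec->start_loc == my_start;
}

/* Proposed rule: the label from backup_label has to match as well. */
static bool
ends_backup_by_label(const BackupEndRecord *rec, uint64_t my_start,
                     const char *my_label)
{
    return rec->start_loc == my_start &&
           strcmp(rec->label, my_label) == 0;
}
```

With a record written by test0 at 0/20000B0, the first rule also accepts it
for a recovery of test3, which is the premature-consistency case. The second
rule rejects it for test3 and accepts it only for test0.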

Thoughts?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

not_create_histfile_if_not_arch_v1.patch
Description: Binary data
