Bonjour Michaël,
The attached patch reorders the cluster fsyncing and the control file update in
"pg_rewind" so that the latter is done after all data are committed to disk,
so that the control file reflects the actual cluster status, similarly to what
is done by "pg_checksums", per the discussion in the thread about offline
enabling of checksums:
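
For illustration, here is a minimal standalone C sketch of the property the
patch enforces (the file names are made up, this is not pg_rewind code): all
data is fsynced first, and the "control file" is only rewritten afterwards, so
a crash between the two steps leaves the previous control file intact.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* Write buf to path, then fsync so the data is durable before returning. */
    static void
    write_and_sync(const char *path, const char *buf)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

        if (fd < 0) { perror(path); exit(1); }
        if (write(fd, buf, strlen(buf)) != (ssize_t) strlen(buf))
        { perror("write"); exit(1); }
        if (fsync(fd) != 0) { perror("fsync"); exit(1); }
        close(fd);
    }

    int
    main(void)
    {
        /* 1. Make all rewritten data durable first... */
        write_and_sync("datafile", "rewound block contents\n");
        /* 2. ...and only then advertise the new cluster state. */
        write_and_sync("controlfile", "state: rewind complete\n");
        return 0;
    }

With the reverse order, a crash between the two writes would leave a control
file that claims a state the data does not yet have.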
It would be an interesting property to see that it is possible to
retry a rewind of a node which has been partially rewound already,
where the operation failed in the middle.
Yes. I understand that the question is whether the Warning in the pg_rewind
documentation can be partially lifted. The short answer is that it is not
obvious.
Because that's the real deal here: as long as we know that its control
file is in its previous state, we can rely on it for retrying the
operation. Logically, I think that it should work, because we would
still try to fetch the same blocks modified since WAL forked from the
source server, by scanning the target's records from the last
checkpoint before the WAL fork point up to the last shutdown
checkpoint, and the operation is lossy by design when it comes to
dealing with file differences.
Have you tried to see if pg_rewind is able to repeat its operation for
specific scenarios?
I have run the non-regression tests. I'm not sure which scenarios are
covered there, but probably not an interruption in the middle of an fsync,
especially as fsync is usually disabled to speed up the tests :-)
One example is a database created on the promoted standby, used as the
source, and a second, different database created on the primary after
the standby has been promoted. You could make the tool exit() before
the rewind finishes, just before updating the control file, and see if
the operation is repeatable. Interrupting the tool would be fine as
well, though less controllable.
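
One way to build such a controlled test could be a fault-injection hook just
before the control file update, roughly like the sketch below (the environment
variable and update_control_file() are invented names for illustration, not
pg_rewind's actual code; the snippet assumes <stdio.h> and <stdlib.h>):

    /* Hypothetical snippet, placed in pg_rewind just before the
     * control file is rewritten, so the tool can be made to die
     * at the critical point on demand. */
    if (getenv("PG_REWIND_EXIT_BEFORE_CONTROLFILE") != NULL)
    {
        fprintf(stderr, "fault injection: exiting before control file update\n");
        exit(1);
    }
    update_control_file();      /* placeholder for the actual final step */

A test could then run pg_rewind once with the variable set, check that it
fails, and run it a second time normally to verify that the retry succeeds.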
It would be good to mention in the patch why the order matters.
Yep. This requires a careful analysis of pg_rewind's inner workings,
which I do not have time to do in the short term.
--
Fabien.