On Wed, Mar 13, 2019 at 10:08:33AM +0100, Fabien COELHO wrote:
> I'm not sure of the punctuation logic on the help line: the first sentence
> does not end with a ".". I could not find an instance of this style in other
> help on pg commands. I'd suggest "check data checksums (default)" would work
> around and be more in line with other commands help.

Good idea, let's do that.

> I slowed down pg_checksums by adding a 0.1s sleep when scanning a new file,
> then started a "pg_checksums --enable" on a stopped cluster, then started
> the cluster while the enabling was in progress, then connected and updated
> data.

Well, yes, don't do that.  You can get into the same class of problems
while running pg_rewind, pg_basebackup or even pg_resetwal once the
initial control file check is done for each one of these tools.

> I do not think it is a good thing that two commands can write to the data
> directory at the same time, really.

We don't prevent a pg_resetwal and a pg_basebackup from running in
parallel either.  That would be... Interesting.

> About fsync-ing: ISTM that it is possible that the control file is written
> to disk while data are still not written, so a failure in between would
> leave the cluster with an inconsistent state. I think that it should fsync
> the data *then* update the control file and fsync again on that one.

If --disable is used, the control file gets updated at the end without
doing anything else.  If the host crashes, the control file could end
up with checksums either enabled or disabled.  If the state is
disabled, then well, it succeeded.  If the state is enabled, then the
control file is still correct, because all the other blocks still have
checksums set.

If --enable is used, we fsync the whole data directory after writing
all the blocks and updating the control file at the end.  The case you
are referring to here is in fsync_pgdata(), not pg_checksums actually,
because you could reach the same state after a simple initdb.  It could
be possible to reach a state where the control file has checksums
enabled and some blocks are not correctly synced, but you would still
notice rather quickly at the follow-up startup if the server is in an
incorrect state.
--
Michael
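
For reference, the ordering suggested in the quoted text (sync the data first, then update the control file, then sync that as well) can be sketched as below. This is only an illustration under stated assumptions, not what pg_checksums or fsync_pgdata() actually do: the file names, the "control" file and the enable_checksums() helper are hypothetical, and syncing a directory through os.open() assumes a POSIX system.

```python
import os
import tempfile


def write_and_fsync(path, payload):
    """Write payload to path and flush it to stable storage."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to flush to disk


def fsync_dir(path):
    """Flush directory entries so newly created files survive a crash (POSIX)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def enable_checksums(datadir, blocks):
    """Hypothetical helper: data first, control file last."""
    # Step 1: write and sync every data file.
    for name, payload in blocks.items():
        write_and_fsync(os.path.join(datadir, name), payload)
    fsync_dir(datadir)
    # Step 2: only now flip the flag in the (hypothetical) control file.
    write_and_fsync(os.path.join(datadir, "control"), b"checksums=on")
    fsync_dir(datadir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        enable_checksums(d, {"rel_1": b"block-a", "rel_2": b"block-b"})
        print(sorted(os.listdir(d)))
```

With this ordering, a crash before step 2 leaves the control flag untouched, so the flag is never set ahead of data that has not reached disk.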