Hi,

On 2019-02-05 06:57:06 +0100, Fabien COELHO wrote:
> > > > I'm wondering (possibly again) about the existing early exit if one
> > > > block cannot be read on retry: the command should count this as a
> > > > kind of bad block, proceed on checking other files, and obviously
> > > > fail in the end, but having checked everything else and generated a
> > > > report. I do not think that this condition warrants a full stop.
> > > > ISTM that under rare race conditions (eg, an unlucky concurrent
> > > > "drop database" or "drop table") this could happen when online,
> > > > although I could not trigger one despite heavy testing, so I'm
> > > > possibly mistaken.
> > >
> > > This seems like a defensible judgement call either way.
> >
> > Right now we have a few tests that explicitly check that
> > pg_verify_checksums fails on broken data ("foo" in the file). Those
> > would then just get skipped AFAICT, which I think is the worse
> > behaviour, but if everybody thinks that should be the way to go, we
> > can drop/adjust those tests and make pg_verify_checksums skip them.
> >
> > Thoughts?
>
> My point is that it should fail as it does, only not immediately (early
> exit), but after having checked everything else. This means avoiding
> calling "exit(1)" here and there (lseek, fopen...), but taking note that
> something bad happened, and calling exit only at the end.
I can see both as being valuable (one gives you a more complete picture,
the other a quicker answer in scripts). For me that's the point where
it's the prerogative of the author to make that choice.

Greetings,

Andres Freund