Hi,

On Tuesday, 05.02.2019 at 11:30 +0100, Tomas Vondra wrote:
> On 2/5/19 8:01 AM, Andres Freund wrote:
> > On 2019-02-05 06:57:06 +0100, Fabien COELHO wrote:
> > > > > > I'm wondering (possibly again) about the existing early exit if
> > > > > > one block cannot be read on retry: the command should count this
> > > > > > as a kind of bad block, proceed on checking other files, and
> > > > > > obviously fail in the end, but having checked everything else and
> > > > > > generated a report. I do not think that this condition warrants a
> > > > > > full stop. ISTM that under rare race conditions (eg, an unlucky
> > > > > > concurrent "drop database" or "drop table") this could happen when
> > > > > > online, although I could not trigger one despite heavy testing, so
> > > > > > I'm possibly mistaken.
> > > > >
> > > > > This seems like a defensible judgement call either way.
> > > >
> > > > Right now we have a few tests that explicitly check that
> > > > pg_verify_checksums fails on broken data ("foo" in the file). Those
> > > > would then just get skipped AFAICT, which I think is the worse
> > > > behaviour, but if everybody thinks that should be the way to go, we
> > > > can drop/adjust those tests and make pg_verify_checksums skip them.
> > > >
> > > > Thoughts?
> > >
> > > My point is that it should fail as it does, only not immediately (early
> > > exit), but after having checked everything else. This means avoiding
> > > calling "exit(1)" here and there (lseek, fopen...), but taking note that
> > > something bad happened, and call exit only in the end.
> >
> > I can see both as being valuable (one gives you a more complete picture,
> > the other a quicker answer in scripts). For me that's the point where
> > it's the prerogative of the author to make that choice.

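
For illustration, the two behaviours discussed in the quoted text (stop at the first unreadable file, or note the failure and exit non-zero only at the end) can be sketched as below. This is a minimal sketch in Python for brevity, not the pg_verify_checksums code, which is written in C. The names scan_files, check_file and early_exit are hypothetical, and the checker is assumed to raise OSError when a file cannot be read.

```python
def scan_files(paths, check_file, early_exit=True):
    """Run check_file(path) on each path; return (exit_code, failed_paths).

    check_file is a hypothetical callback that raises OSError when a file
    cannot be read or verified.
    """
    failed = []
    for path in paths:
        try:
            check_file(path)
        except OSError:
            failed.append(path)
            if early_exit:
                # Early-exit variant: stop at the first problem.
                return 1, failed
    # Deferred variant: every file was visited; still fail if anything broke.
    return (1 if failed else 0), failed


def demo_check(path):
    # Stand-in checker for the example: treats the path "bad" as unreadable.
    if path == "bad":
        raise OSError("cannot read " + path)


# Early exit never looks at "c"; the deferred variant checks it too.
# Both variants return a non-zero exit code.
early_result = scan_files(["a", "bad", "c"], demo_check, early_exit=True)
deferred_result = scan_files(["a", "bad", "c"], demo_check, early_exit=False)
```

In both variants the caller gets a failing exit code. The difference is only whether files after the first failure are still checked and reported.
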
Personally, I would prefer to keep it as simple as possible for now and
get this patch committed. In my opinion the tool already behaves this way
(early exit on corrupt files), so I don't think the online verification
patch should change that. If we see complaints about this, I'd be happy
to change it afterwards.

> Why not make this configurable, using a command-line option?

I like this even less. This tool is about verifying checksums, so adding
options for what to do when it encounters broken pages looks out of scope
to me, unless we want to say it should generally abort on the first issue
(i.e. on wrong checksums as well).


Michael

-- 
Michael Banck
Project Manager / Senior Consultant
Email: michael.ba...@credativ.de
credativ GmbH