On 2/5/19 8:01 AM, Andres Freund wrote:
> Hi,
>
> On 2019-02-05 06:57:06 +0100, Fabien COELHO wrote:
>>>>> I'm wondering (possibly again) about the existing early exit if one block
>>>>> cannot be read on retry: the command should count this as a kind of bad
>>>>> block, proceed on checking other files, and obviously fail in the end, but
>>>>> having checked everything else and generated a report. I do not think that
>>>>> this condition warrants a full stop. ISTM that under rare race conditions
>>>>> (eg, an unlucky concurrent "drop database" or "drop table") this could
>>>>> happen when online, although I could not trigger one despite heavy
>>>>> testing, so I'm possibly mistaken.
>>>>
>>>> This seems like a defensible judgement call either way.
>>>
>>> Right now we have a few tests that explicitly check that
>>> pg_verify_checksums fails on broken data ("foo" in the file). Those
>>> would then just get skipped AFAICT, which I think is the worse
>>> behaviour, but if everybody thinks that should be the way to go, we can
>>> drop/adjust those tests and make pg_verify_checksums skip them.
>>>
>>> Thoughts?
>>
>> My point is that it should fail as it does, only not immediately (early
>> exit), but after having checked everything else. This means avoiding calling
>> "exit(1)" here and there (lseek, fopen...), but taking note that something
>> bad happened, and calling exit only at the end.
>
> I can see both as being valuable (one gives you a more complete picture,
> the other a quicker answer in scripts). For me that's the point where
> it's the prerogative of the author to make that choice.
>
Why not make this configurable, using a command-line option?
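
To make the idea concrete, here is a rough sketch of how the two behaviours
could share one code path. It is not taken from pg_verify_checksums: the
ScanState struct, the stop_on_error flag, check_file() and scan_files() are
all hypothetical names, and "can the file be opened" merely stands in for the
real per-block checksum verification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical scan state; none of these names exist in pg_verify_checksums. */
typedef struct ScanState
{
	bool		stop_on_error;	/* hypothetical option: give up at first failure */
	long		files_scanned;	/* files we attempted to check */
	long		failures;		/* files that could not be verified */
} ScanState;

/*
 * Stand-in for the real verification: a file counts as "bad" here simply
 * when it cannot be opened for reading.
 */
static bool
check_file(const char *path)
{
	FILE	   *fp = fopen(path, "rb");

	if (fp == NULL)
		return false;
	fclose(fp);
	return true;
}

/*
 * Check every path. Failures are recorded instead of calling exit(1) on the
 * spot; the loop stops early only when stop_on_error is set. The return
 * value is meant to be used as the process exit status.
 */
static int
scan_files(ScanState *state, const char *const *paths, int npaths)
{
	assert(state != NULL);

	for (int i = 0; i < npaths; i++)
	{
		state->files_scanned++;
		if (check_file(paths[i]))
			continue;

		state->failures++;
		fprintf(stderr, "could not verify file \"%s\"\n", paths[i]);
		if (state->stop_on_error)
			break;				/* quick answer for scripts */
	}

	return state->failures > 0 ? 1 : 0;
}
```

With stop_on_error unset, every file is visited and the command still exits
with status 1 if anything failed, which is the complete-report behaviour
Fabien describes. With it set, the scan stops at the first failure, which is
roughly what happens today. Which of the two should be the default is a
separate question.
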
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services