> On Mar 15, 2021, at 11:10 AM, Julien Rouhaud <rjuju...@gmail.com> wrote:
>
> On Mon, Mar 15, 2021 at 10:56:50AM -0700, Mark Dilger wrote:
>>
>>
>>> On Mar 15, 2021, at 10:50 AM, Julien Rouhaud <rjuju...@gmail.com> wrote:
>>>
>>> On Mon, Mar 15, 2021 at 10:40:25AM -0700, Mark Dilger wrote:
>>>> I'm saying that your patch seems to call down to
>>>> get_collation_actual_version() via get_collation_version_for_oid() from
>>>> your new function do_check_index_has_outdated_collation(), but I'm not
>>>> seeing how that gets exercised.
>>>
>>> It's a little bit late here so sorry if I'm missing something.
>>>
>>> do_check_index_has_outdated_collation() is called from
>>> index_has_outdated_collation() which is called from
>>> index_has_outdated_dependency() which is called from
>>> RelationGetIndexListFiltered(), and that function is called when passing the
>>> OUTDATED option to REINDEX (and reindexdb --outdated). So this is exercised
>>> with added tests for both matching and non matching collation version.
>>
>> Ok, fair enough. I was thinking about the case where the collation actually
>> returns a different version number because it (the C library providing the
>> collation) got updated, but I think you've answered already that you are not
>> planning to test that case, only the case where pg_depend is modified to
>> have a bogus version number.
>
> This infrastructure is supposed to detect that the collation library *used to*
> return a different version before it was updated. And that's exactly what
> we're testing by manually updating the refobjversion.
>
>> It seems a bit odd to me that a feature intended to handle cases where
>> collations are updated is not tested via having a collation be updated
>> during the test. It leaves open the possibility that something differs
>> between the test and reindexed run after real world collation updates. But
>> that's for the committer who picks up your patch to decide, and perhaps it
>> is unfair to make your patch depend on addressing that issue.
>
> Why is that odd? We're testing that we're correctly storing the collation
> version during index creating and correctly detecting a mismatch. Having a
> fake collation provider to return a fake version number won't add any more
> coverage unless I'm missing something.
>
> It's similar to how we test the various corruption scenario. AFAIK we're not
> providing custom drivers to write corrupted data but we're simply simulating a
> corruption overwriting some blocks.
We do test corrupt relations. We intentionally corrupt the pages within
corrupted heap tables to check that they get reported as corrupt. (See
src/bin/pg_amcheck/t/004_verify_heapam.pl) Admittedly, the corruptions used in
the tests are not necessarily representative of corruptions that might occur in
the wild, but that is a hard problem to solve, since we don't know the
statistical distribution of corruptions in the wild.
If you had a real, not fake, collation provider which actually provided a
collation with an actual version number, stopped the server, changed the
behavior of the collation as well as its version number, started the server,
and ran REINDEX (OUTDATED), I think that would be a more real-world test. I'm
not demanding that you write such a test. I'm just saying that it is strange
that we don't have coverage for this anywhere, and was asking if you think
there is such coverage, because, you know, maybe I just didn't see where that
test was lurking.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company