Re: REINDEX backend filtering

Mark Dilger Mon, 15 Mar 2021 11:33:02 -0700

> On Mar 15, 2021, at 11:10 AM, Julien Rouhaud <rjuju...@gmail.com> wrote:
> 
> On Mon, Mar 15, 2021 at 10:56:50AM -0700, Mark Dilger wrote:
>> 
>> 
>>> On Mar 15, 2021, at 10:50 AM, Julien Rouhaud <rjuju...@gmail.com> wrote:
>>> 
>>> On Mon, Mar 15, 2021 at 10:40:25AM -0700, Mark Dilger wrote:
>>>> I'm saying that your patch seems to call down to 
>>>> get_collation_actual_version() via get_collation_version_for_oid() from 
>>>> your new function do_check_index_has_outdated_collation(), but I'm not 
>>>> seeing how that gets exercised.
>>> 
>>> It's a little bit late here so sorry if I'm missing something.
>>> 
>>> do_check_index_has_outdated_collation() is called from
>>> index_has_outdated_collation() which is called from
>>> index_has_outdated_dependency() which is called from
>>> RelationGetIndexListFiltered(), and that function is called when passing the
>>> OUTDATED option to REINDEX (and reindexdb --outdated).  So this is exercised
>>> with added tests for both matching and non matching collation version.
>> 
>> Ok, fair enough.  I was thinking about the case where the collation actually 
>> returns a different version number because it (the C library providing the 
>> collation) got updated, but I think you've answered already that you are not 
>> planning to test that case, only the case where pg_depend is modified to 
>> have a bogus version number.
> 
> This infrastructure is supposed to detect that the collation library *used to*
> return a different version before it was updated.  And that's exactly what
> we're testing by manually updating the refobjversion.
> 
>> It seems a bit odd to me that a feature intended to handle cases where 
>> collations are updated is not tested via having a collation be updated 
>> during the test.  It leaves open the possibility that something differs 
>> between the test and reindexed run after real world collation updates.  But 
>> that's for the committer who picks up your patch to decide, and perhaps it 
>> is unfair to make your patch depend on addressing that issue.
> 
> Why is that odd?  We're testing that we're correctly storing the collation
> version during index creating and correctly detecting a mismatch.  Having a
> fake collation provider to return a fake version number won't add any more
> coverage unless I'm missing something.
> 
> It's similar to how we test the various corruption scenario.  AFAIK we're not
> providing custom drivers to write corrupted data but we're simply simulating a
> corruption overwriting some blocks.

We do test corrupt relations.  We intentionally corrupt the pages within 
corrupted heap tables to check that they get reported as corrupt.  (See 
src/bin/pg_amcheck/t/004_verify_heapam.pl)  Admittedly, the corruptions used in 
the tests are not necessarily representative of corruptions that might occur in 
the wild, but that is a hard problem to solve, since we don't know the 
statistical distribution of corruptions in the wild.

If you had a real, not fake, collation provider which actually provided a 
collation with an actual version number, stopped the server, changed the 
behavior of the collation as well as its version number, started the server, 
and ran REINDEX (OUTDATED), I think that would be a more real-world test.  I'm 
not demanding that you write such a test.  I'm just saying that it is strange 
that we don't have coverage for this anywhere, and was asking if you think 
there is such coverage, because, you know, maybe I just didn't see where that 
test was lurking.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: REINDEX backend filtering

Reply via email to