> On Feb 21, 2025, at 12:16 PM, Mark Dilger <mark.dil...@enterprisedb.com>
> wrote:
>
> The pgbench script is not corrupting anything overtly, so this looks to
> either be a bug in gin or a bug in the check.

I suspected the AccessShareLock taken by verify_gin() might be too weak, so I
upgraded it to ShareRowExclusiveLock to prevent concurrent table
modifications (and, incidentally, other concurrent verify_gin() calls). To my
surprise, that didn't fix anything. Even AccessExclusiveLock doesn't fix it.

So this seems to be either a bug in the checking code complaining about a
perfectly valid tuple order, or a bug in GIN corrupting its own entry tree page.
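
For reference, the reasoning behind the lock upgrade can be sketched as a small
conflict table. This is only an illustration in Python, not code from the
patch: the rows follow the table-level lock conflict matrix in the PostgreSQL
documentation, the conflicts() helper is a hypothetical name, and only the
modes mentioned above are listed. INSERT, UPDATE and DELETE take
RowExclusiveLock on the table.

```python
# Table-level lock conflicts for the modes discussed above, following the
# conflict matrix in the PostgreSQL "Explicit Locking" documentation.
# Illustrative only; conflicts() is a hypothetical helper, not a server API.
ALL_MODES = {
    "AccessShareLock", "RowShareLock", "RowExclusiveLock",
    "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
    "ExclusiveLock", "AccessExclusiveLock",
}

CONFLICTS = {
    # Blocks only AccessExclusiveLock, so concurrent DML proceeds.
    "AccessShareLock": {"AccessExclusiveLock"},
    # Blocks writers and is self-conflicting.
    "ShareRowExclusiveLock": {
        "RowExclusiveLock", "ShareUpdateExclusiveLock", "ShareLock",
        "ShareRowExclusiveLock", "ExclusiveLock", "AccessExclusiveLock",
    },
    # Blocks every other mode, including plain readers.
    "AccessExclusiveLock": set(ALL_MODES),
}


def conflicts(held: str, requested: str) -> bool:
    """Return True if a lock request must wait behind the held mode."""
    return requested in CONFLICTS[held]


if __name__ == "__main__":
    dml = "RowExclusiveLock"  # taken by INSERT/UPDATE/DELETE
    for mode in ("AccessShareLock", "ShareRowExclusiveLock",
                 "AccessExclusiveLock"):
        print(mode, "blocks concurrent DML:", conflicts(mode, dml))
```

Since the failure persists even under AccessExclusiveLock, which blocks every
other mode, concurrent table modification during the check is ruled out as the
cause.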

On successive runs (instrumented to print out a bit more info), there doesn't
seem to be any obvious pattern in where the corruption occurs. The offset
within the page changes: it is neither always at the beginning nor always at
maxoff. Likewise, the block where corruption is detected changes from run to
run. I've noticed that the rightlink for the page is always the page's block
number plus one, but that might just be because I haven't run enough
iterations yet to see counter-examples.
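
To make "where the corruption occurs" concrete, here is a minimal sketch of the
kind of per-page ordering check being described. It is an assumption-laden
illustration in Python, not the actual verify_gin() logic: keys are modeled as
plain comparable values, offsets are 1-based as in PostgreSQL pages, and
first_order_violation() is a hypothetical name.

```python
from typing import Optional, Sequence


def first_order_violation(keys: Sequence[int]) -> Optional[int]:
    """Return the 1-based offset of the first key that sorts below its
    predecessor on the page, or None if the keys are in ascending order.

    Hypothetical sketch: real GIN entry tuples are compared with the
    opclass comparison function, not with Python's "<".
    """
    for off in range(2, len(keys) + 1):
        if keys[off - 1] < keys[off - 2]:
            return off
    return None


if __name__ == "__main__":
    print(first_order_violation([10, 20, 30, 40]))  # ordered page
    print(first_order_violation([10, 30, 20, 40]))  # violation mid-page
```

In these terms, the reported offset varies from run to run between 2 and
maxoff, and so does the block on which the check fails.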

Could one of the patch authors take a look? I don't have the time to chase
this to conclusion just now. Thanks.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company