Hi,
On Tue, Mar 16, 2021 at 1:44 PM Peter Geoghegan <p...@bowt.ie> wrote:

> On Tue, Mar 16, 2021 at 5:01 AM Avinash Kumar
> <avinash.vallar...@gmail.com> wrote:
> > I am afraid that it looks to me like a deduplication bug but not sure
> > how this can be pin-pointed. If there is something I could do to determine
> > that, I would be more than happy.
>
> That cannot be ruled out, but I don't consider it to be the most
> likely explanation. The index in question passes amcheck verification,
> which includes verification of the posting list tuple structure, and
> even includes making sure the index has an entry for each row from the
> table. It's highly unlikely that it is corrupt, and it's hard to see
> how you get from a non-corrupt index to the segfault. At the same time
> we see that some other index is corrupt -- it fails amcheck due to a
> cross-level inconsistency, which is very unlikely to be related to
> deduplication in any way. It's hard to believe that the problem is
> squarely with _bt_swap_posting().
>
> Did you actually run amcheck on the failed-over server, not the original
> server?
>
Yes, it was on the failed-over server, where the issue is currently seen.
I took a snapshot of the data directory so that the issue can be analyzed.

>
> Note that you can disable deduplication selectively -- perhaps doing
> so will make it possible to isolate the issue. Something like this
> should do it (you need to reindex here to actually change the on-disk
> representation to not have any posting list tuples from
> deduplication):
>
> alter index idx_id_mtime set (deduplicate_items = off);
> reindex index idx_id_mtime;
>
I can do this. I should add, though, that when we run pg_repack or rebuild
the indexes, the problem goes away on its own. I am not sure whether the
same issue will come back afterwards.

>
> --
> Peter Geoghegan
>

--
Regards,
Avi.
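
P.S. For anyone following the thread, here is a rough sketch of how the
amcheck verification and the deduplication change discussed above could be
run. The index name idx_id_mtime is the one from the thread. Using
bt_index_parent_check with both heapallindexed and rootdescend enabled is
only an assumption about what a thorough check looks like, not necessarily
the exact call that was run here.

```sql
-- amcheck ships with PostgreSQL as a contrib extension.
CREATE EXTENSION IF NOT EXISTS amcheck;

-- Thorough check: verifies parent/child (cross-level) invariants, and with
-- heapallindexed = true also confirms that every heap row has an index entry.
-- Note: this takes a ShareLock on the index and its table, so writes are
-- blocked while it runs.
SELECT bt_index_parent_check('idx_id_mtime'::regclass, true, true);

-- Disable deduplication, then rebuild so that no posting list tuples
-- remain in the on-disk representation.
ALTER INDEX idx_id_mtime SET (deduplicate_items = off);
REINDEX INDEX idx_id_mtime;

-- Confirm that the storage parameter is now recorded for the index.
SELECT relname, reloptions FROM pg_class WHERE relname = 'idx_id_mtime';
```

If the lock taken by bt_index_parent_check is a concern, bt_index_check is
the lighter-weight alternative, although it does not verify the cross-level
invariants.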