> On 29 Jun 2022, at 17:43, Robins Tharakan <thara...@gmail.com> wrote:


Sorry to bump an ancient thread, but I have some observations that might or
might not be relevant.
Recently we noticed corruption on one of our clusters. The corruption at hand
is not in the system catalog, but in user indexes.
The cluster was correctly configured: checksums, fsync, FPI etc.
The cluster was never restored from a backup. It’s a single-node cluster, so it
was never promoted, pg_rewind-ed etc. The VM had never been rebooted.

But the cluster had been experiencing 10 OOMs a day. There were no torn pages
and no checksum errors in the log at all. Yet, B-tree indexes became corrupted.


Sorry for this wall of text, I’m posting everything as-is in case there is
some useful information.

$ /etc/cron.yandex/pg_corruption_check.py --index
2024-03-01 11:54:05,075 ERROR : Corrupted index: 96009 
table1_table1message_table1_team_identity_06a95642 XX002 ERROR: posting list 
contains misplaced TID in index 
"table1_table1message_table1_team_identity_06a95642" DETAIL: Index tid=(267,34) 
posting list offset=137 page lsn=31B/62159608.
2024-03-01 11:54:05,100 ERROR : Corrupted index: 96008 
table1_table1message_organization_id_66c18ed2 XX002 ERROR: posting list 
contains misplaced TID in index "table1_table1message_organization_id_66c18ed2" 
DETAIL: Index tid=(267,34) posting list offset=137 page lsn=31B/62158BC8.
2024-03-01 11:54:05,355 ERROR : Corrupted index: 95804 
table2_aler_channel_81aeec_idx XX002 ERROR: posting list contains misplaced TID 
in index "table2_aler_channel_81aeec_idx" DETAIL: Index tid=(336,7) posting 
list offset=182 page lsn=314/9B794248.
2024-03-01 11:54:05,716 ERROR : Corrupted index: 95816 
table2_table3_channel_id_91a1912f XX002 ERROR: posting list contains misplaced 
TID in index "table2_table3_channel_id_91a1912f" DETAIL: Index tid=(384,2) 
posting list offset=72 page lsn=317/3F14F390.
2024-03-01 11:54:06,068 ERROR : Corrupted index: 95815 
table2_table3_channel_filter_id_6706c8b6 XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_channel_filter_id_6706c8b6" DETAIL: Index 
tid=(380,2) posting list offset=72 page lsn=317/3F0D8E30.
2024-03-01 11:54:06,302 ERROR : Corrupted index: 95824 
table2_table3_root_alert_group_id_f327f122 XX002 ERROR: item order invariant 
violated for index "table2_table3_root_alert_group_id_f327f122" DETAIL: Lower 
index tid=(368,204) (points to heap tid=(48901,2)) higher index tid=(368,205) 
(points to heap tid=(48901,2)) page lsn=319/3C234588.
2024-03-01 11:54:06,538 ERROR : Corrupted index: 95810 
table2_table3_acknowledged_by_user_id_dd6723dc XX002 ERROR: posting list 
contains misplaced TID in index 
"table2_table3_acknowledged_by_user_id_dd6723dc" DETAIL: Index tid=(380,69) 
posting list offset=35 page lsn=317/C14E2D50.
2024-03-01 11:54:06,775 ERROR : Corrupted index: 95825 
table2_table3_silenced_by_user_id_40a833a1 XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_silenced_by_user_id_40a833a1" DETAIL: 
Index tid=(371,11) posting list offset=144 page lsn=318/61171918.
2024-03-01 11:54:07,009 ERROR : Corrupted index: 95829 
table2_table3_wiped_by_id_4326ff61 XX002 ERROR: item order invariant violated 
for index "table2_table3_wiped_by_id_4326ff61" DETAIL: Lower index tid=(373,97) 
(points to heap tid=(48901,2)) higher index tid=(373,98) (points to heap 
tid=(48901,2)) page lsn=318/61172788.
2024-03-01 11:54:07,245 ERROR : Corrupted index: 95823 
table2_table3_resolved_by_user_id_463cdf3d XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_resolved_by_user_id_463cdf3d" DETAIL: 
Index tid=(375,89) posting list offset=144 page lsn=319/3C1DCFC8.
2024-03-01 11:54:07,479 ERROR : Corrupted index: 95819 
table2_table3_maintenance_uuid_9a7b8529_like XX002 ERROR: item order invariant 
violated for index "table2_table3_maintenance_uuid_9a7b8529_like" DETAIL: Lower 
index tid=(372,4) (points to heap tid=(48901,2)) higher index tid=(372,5) 
(points to heap tid=(48901,2)) page lsn=317/C1A210A8.
2024-03-01 11:54:07,717 ERROR : Corrupted index: 95827 
table2_table3_table1_message_id_58a31784_like XX002 ERROR: posting list 
contains misplaced TID in index "table2_table3_table1_message_id_58a31784_like" 
DETAIL: Index tid=(373,89) posting list offset=144 page lsn=319/3C3EE660.
2024-03-01 11:54:08,162 ERROR : Corrupted index: 96066 
webhooks_webhookresponse_webhook_id_db49ebcd XX002 ERROR: item order invariant 
violated for index "webhooks_webhookresponse_webhook_id_db49ebcd" DETAIL: Lower 
index tid=(522,24) (points to heap tid=(73981,1)) higher index tid=(522,25) 
(points to heap tid=(73981,1)) page lsn=31B/E522B640.
2024-03-01 11:54:08,646 ERROR : Corrupted index: 95822 
table2_table3_resolved_by_alert_id_bbdf0a83 XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_resolved_by_alert_id_bbdf0a83" DETAIL: 
Index tid=(618,2) posting list offset=150 page lsn=317/C1DE74B8.
2024-03-01 11:54:08,873 ERROR : Corrupted index: 95427 
table2_table3_table1_message_id_key XX002 ERROR: item order invariant violated 
for index "table2_table3_table1_message_id_key" DETAIL: Lower index 
tid=(369,134) (points to heap tid=(48901,2)) higher index tid=(369,135) (points 
to heap tid=(48901,2)) page lsn=319/3B629E58.
2024-03-01 11:54:09,108 ERROR : Corrupted index: 95417 
table2_table3_maintenance_uuid_key XX002 ERROR: posting list contains misplaced 
TID in index "table2_table3_maintenance_uuid_key" DETAIL: Index tid=(371,42) 
posting list offset=47 page lsn=318/6116FC50.
2024-03-01 11:54:10,180 ERROR : Corrupted index: 95826 
table2_table3_table1_log_message_id_587aaa8d_like XX002 ERROR: posting list 
contains misplaced TID in index 
"table2_table3_table1_log_message_id_587aaa8d_like" DETAIL: Index tid=(849,19) 
posting list offset=79 page lsn=319/3C389B60.
2024-03-01 11:54:10,689 ERROR : Corrupted index: 95820 
table2_table3_mattermost_log_message_id_69bc2ae4_like XX002 ERROR: item order 
invariant violated for index 
"table2_table3_mattermost_log_message_id_69bc2ae4_like" DETAIL: Lower index 
tid=(559,4) (points to heap tid=(48901,2)) higher index tid=(559,5) (points to 
heap tid=(48901,2)) page lsn=317/C1A7BA50.
2024-03-01 11:54:11,760 ERROR : Corrupted index: 95425 
table2_table3_table1_log_message_id_key XX002 ERROR: item order invariant 
violated for index "table2_table3_table1_log_message_id_key" DETAIL: Lower 
index tid=(849,22) (points to heap tid=(48901,2)) higher index tid=(849,23) 
(points to heap tid=(48901,2)) page lsn=317/3E7EC1F0.
2024-03-01 11:54:12,282 ERROR : Corrupted index: 95419 
table2_table3_mattermost_log_message_id_key XX002 ERROR: posting list contains 
misplaced TID in index "table2_table3_mattermost_log_message_id_key" DETAIL: 
Index tid=(566,84) posting list offset=65 page lsn=319/3B1901F8.
2024-03-01 11:54:17,990 ERROR : Corrupted index: 95423 
table2_table3_public_primary_key_key XX002 ERROR: cross page item order 
invariant violated for index "table2_table3_public_primary_key_key" DETAIL: 
Last item on page tid=(727,146) page lsn=31B/E104D660.


Most of these messages look similar, except the last one: “cross page item
order invariant violated for index”. Indeed, index scans were hanging in a loop.
I could not locate the problem in WAL yet, because a lot of other stuff is
going on. But I have no other idea than to suspect that posting list redo
corrupts the index in case of a crash.
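
In case it helps anyone slice the output above, here is a minimal sketch that
tallies the reports by error kind and collects the heap TIDs mentioned in them.
It is not part of pg_corruption_check.py; the function name tally() and the
parsing rules are my own assumptions, based only on the message format pasted
in this mail.

```python
import re
from collections import Counter

# Error kinds exactly as they appear in the messages quoted in this mail.
# "cross page ..." must be tested before "item order ..." because the
# latter is a substring of the former.
KINDS = (
    "posting list contains misplaced TID",
    "cross page item order invariant violated",
    "item order invariant violated",
)


def tally(log_text):
    """Count reports per error kind and per heap TID (hypothetical helper)."""
    # Undo the e-mail line wrapping so each report is one logical line.
    flat = " ".join(log_text.split())
    kinds = Counter()
    heap_tids = Counter()
    # Every report starts with "Corrupted index: <oid>".
    for report in flat.split("Corrupted index:")[1:]:
        for kind in KINDS:
            if kind in report:
                kinds[kind] += 1
                break
        # Count each heap TID once per report, even if it shows up twice
        # (lower and higher index tuple pointing to the same heap tuple).
        heap_tids.update(set(re.findall(r"heap tid=\((\d+,\d+)\)", report)))
    return kinds, heap_tids
```

Feeding it the pasted output as one string and printing kinds.most_common()
and heap_tids.most_common() should show at a glance how the reports split
between the three messages and whether the same heap TID recurs across indexes.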

Thanks!


Best regards, Andrey Borodin.
