On Wed, Oct 25, 2023 at 01:39:41PM +0300, Smolkin Grigory wrote:
> We are running PG13.10 and recently we have encountered what appears to be
> a bug due to some race condition between ALTER TABLE ... ADD CONSTRAINT and
> some other catalog-writer, possibly ANALYZE.
> The problem is that after successfully creating index on relation (which
> previosly didnt have any indexes), its pg_class.relhasindex remains set to
> "false", which is illegal, I think.
> Index was built using the following statement:
> ALTER TABLE "example" ADD constraint "example_pkey" PRIMARY KEY (id);

This is going to be a problem with any operation that does a transactional
pg_class update without taking a lock that conflicts with ShareLock.  GRANT
doesn't lock the table at all, so I can reproduce this in v17 as follows:

== session 1
create table t (c int);
begin;
grant select on t to public;

== session 2
alter table t add primary key (c);

== back in session 1
commit;

We'll likely need to change how we maintain relhasindex or perhaps take a lock
in GRANT.

> Looking into the WAL via waldump given us the following picture (full
> waldump output is attached):
> 1202295045 - create index statement
> 1202298790 and 1202298791 are some other concurrent operations,
> unfortunately I wasnt able to determine what are they

Can you explore that as follows?

- PITR to just before the COMMIT record.
- Save all rows of pg_class.
- PITR to just after the COMMIT record.
- Save all rows of pg_class.
- Diff the two sets of saved rows.

Which columns changed?  The evidence you've shown would be consistent with a
transaction doing GRANT or REVOKE on dozens of tables.  If the changed column is
something other than relacl, that would be great to know.

On the off-chance it's relevant, what extensions do you have (\dx in psql)?
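
For the diff step in the list above, a minimal sketch in Python, under some assumptions: each set of pg_class rows was saved as CSV with a header row, for example with psql's `\copy (select * from pg_class order by oid) to 'pg_class_before.csv' with (format csv, header)`. The file name, the helper names `load_rows` and `diff_rows`, and the sample rows are all made up for illustration; they are not taken from the report.

```python
import csv
import io


def load_rows(csv_text, key="oid"):
    """Parse a CSV dump (with header row) into {key value: row dict}."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def diff_rows(before, after):
    """Return {oid: {column: (old, new)}} for rows present in both dumps."""
    changes = {}
    for oid in sorted(before.keys() & after.keys(), key=int):
        old, new = before[oid], after[oid]
        cols = {c: (old[c], new.get(c)) for c in old if old[c] != new.get(c)}
        if cols:
            changes[oid] = cols
    return changes


# Made-up sample data standing in for the two saved dumps; real dumps
# would be read from the files written before and after the COMMIT record.
BEFORE = 'oid,relname,relhasindex,relacl\n16384,example,f,\n16390,other,t,\n'
AFTER = ('oid,relname,relhasindex,relacl\n'
         '16384,example,f,"{=r/postgres}"\n16390,other,t,\n')

changes = diff_rows(load_rows(BEFORE), load_rows(AFTER))
for oid, cols in changes.items():
    print(oid, cols)
```

Only rows whose OID appears in both dumps are compared, so relations created or dropped by the transaction would need a separate check. Any column listed other than relacl is the interesting case.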