Hi Ayush, On 8/13/24 07:37, Ayush Tiwari wrote: > Hi hackers, > > We encountered an issue lately, that if the database grants too many > roles `datacl` is toasted, following which, the drop database command > will fail with error "wrong tuple length". > > To reproduce the issue, please follow below steps: > > CREATE DATABASE test; > > -- create helper function > CREATE OR REPLACE FUNCTION data_tuple() returns text as $body$ > declare > mycounter int; > begin > for mycounter in select i from generate_series(1,2000) i loop > execute 'CREATE > ROLE aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter; > execute 'GRANT ALL ON DATABASE test to > aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter; > end loop; > return 'ok'; > end; > $body$ language plpgsql volatile strict; > > -- create roles and grant on the database. > SELECT data_tuple(); > > -- drop database command, this will result in "wrong tuple length" error. > DROP DATABASE test; > > The root cause of this behaviour is that the HeapTuple in dropdb > function fetches a copy of pg_database tuple from system cache. > But the system cache flattens any toast attributes, which cause the > length check to fail in heap_inplace_update. > > A patch for this issue is attached to the mail, the solution is to > change the logic to fetch the tuple by directly scanning pg_database > rather than using the catcache. >
Thanks for the report. I can reproduce the issue following your instructions, and the fix seems reasonable ... But there's also one thing I don't quite understand. I did look for other places that might have a similar issue, that is places that 1) lookup tuple using SearchSysCacheCopy1 2) call on the tuple heap_inplace_update And I found about four places doing that: - index_update_stats (src/backend/catalog/index.c) - create_toast_table (src/backend/catalog/toasting.c) - vac_update_relstats / vac_update_datfrozenxid (commands/vacuum.c) But I haven't managed to trigger the same kind of failure for any of those places, despite trying. AFAIK that's because those places update pg_class, and that doesn't have TOAST, so the tuple length can't change. So this fix seems reasonable. -- Tomas Vondra