On Wed, Mar 10, 2021 at 6:52 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> The pending comment is providing a way to rewrite a table and
> re-compress the data with the current compression method.

I spent some time poking at this yesterday and couldn't figure out what was going on here. There are two places where we rewrite tables. One is the stuff in cluster.c, which handles VACUUM FULL and CLUSTER. That eventually calls reform_and_rewrite_tuple(), which deforms the old tuple and creates a new one, but it doesn't seem like there's anything in there that would expand toasted values, whether external or inline compressed. But I think that can't be right, because it seems like then you'd end up with toast pointers into the old TOAST relation, not the new one, which would cause failures later. So I must be missing something here. The other place where we rewrite tables is in ATRewriteTable() as part of the ALTER TABLE machinery. I don't see anything there to force detoasting either.

That said, I think that using the word REWRITE may not really capture what we're on about. Leaving aside the question of exactly what the CLUSTER code does today, you could in theory rewrite the main table by just taking all the tuples and putting them into a new relfilenode. And then you could do the same thing with the TOAST table. And despite having fully rewritten both tables, you wouldn't have done anything that helps with this problem, because you haven't deformed the tuples at any point. Now as it happens we do have code -- in reform_and_rewrite_tuple() -- that does deform and reform the tuples, but it doesn't take care of this problem either. We might need to distinguish between rewriting the table, which is mostly about getting a new relfilenode, and some other word that means decompressing each value and compressing it again with the current method.

But I am not really convinced that we need to solve this problem by adding new ALTER TABLE syntax. I'd be happy enough if CLUSTER, VACUUM FULL, and versions of ALTER TABLE that already force a rewrite would cause the compression to be redone also.
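
To make the distinction between "rewriting" and "redoing the compression" concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL code: zlib and lzma stand in for two column compression methods, and every name in it (Datum, rewrite_only, rewrite_and_recompress) is made up for illustration.

```python
import lzma
import zlib
from dataclasses import dataclass

# Hypothetical stand-ins: zlib and lzma play the role of two column
# compression methods. None of these names come from PostgreSQL.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


@dataclass(frozen=True)
class Datum:
    method: str     # which codec produced `payload`
    payload: bytes  # the compressed bytes


def compress(raw: bytes, method: str) -> Datum:
    return Datum(method, CODECS[method][0](raw))


def decompress(datum: Datum) -> bytes:
    return CODECS[datum.method][1](datum.payload)


def rewrite_only(table):
    """Copy every tuple into new storage without looking inside it."""
    return [tuple(row) for row in table]


def rewrite_and_recompress(table, target_method):
    """Deform each tuple and recompress values stored with another method."""
    new_table = []
    for row in table:
        new_row = tuple(
            datum if datum.method == target_method
            else compress(decompress(datum), target_method)
            for datum in row
        )
        new_table.append(new_row)
    return new_table


# Two single-column rows, compressed with different methods.
old = [(compress(b"x" * 1000, "zlib"),), (compress(b"y" * 1000, "lzma"),)]

copied = rewrite_only(old)
redone = rewrite_and_recompress(old, "lzma")

print([d.method for row in copied for d in row])  # ['zlib', 'lzma']
print([d.method for row in redone for d in row])  # ['lzma', 'lzma']
```

The plain copy produces new storage but leaves the old compression method on every value; only the pass that actually opens each value changes it. The same per-value check would be the natural thing to run on insert into a column with a configured method.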
Honestly, even if the user had to fall back on creating a new table and doing INSERT INTO newtab SELECT * FROM oldtab, I would consider that to be not a total showstopper for this, assuming of course that it actually works. If it doesn't, we have big problems.

Even without the pg_am stuff, we still need to make sure that we don't just blindly let compressed values wander around everywhere. When we insert into a table column with a compression method, we should recompress any data that is compressed using some other method.

--
Robert Haas
EDB: http://www.enterprisedb.com