Hi,
    we are migrating away from our old approach to binary data, moving it from bytea fields in a table to external storage (making the database smaller and the related operations faster and smarter). In short, we have a background job that copies the data from the table to an external file and then sets the bytea field to NULL.
(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = <uuid>)

At the end of the operation this leaves the table at less than one tenth of its original size. We have a multi-tenant architecture (hundreds of schemas with an identical structure, all inheriting from public), and we are performing this task on one table per schema.

The problem is that, as you may imagine, this generates BIG table bloat.
Running VACUUM FULL on a formerly 22 GB table on a standalone test server is almost immediate. If I had only one server, I would process one table at a time with a nightly script and issue a VACUUM FULL on the tables that have already been processed.
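
Just to make that single-server plan concrete, the nightly script would do something along these lines. This is only a rough sketch: "tbl" is the same placeholder as in the UPDATE above, 'some_schema' stands for one tenant schema, the schema filter is just an example, and psql has to run with autocommit on so the VACUUM statements are allowed:

  -- optional: measure dead/free space in a processed table (pgstattuple is a contrib extension)
  CREATE EXTENSION IF NOT EXISTS pgstattuple;
  SELECT * FROM pgstattuple_approx('some_schema.tbl');

  -- generate one VACUUM FULL per tenant schema and let psql's \gexec
  -- run each generated statement as a separate command
  SELECT format('VACUUM FULL %I.tbl', nspname)
  FROM pg_namespace
  WHERE nspname NOT LIKE 'pg\_%'
    AND nspname NOT IN ('information_schema', 'public')
  \gexec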

But I'm running a logical replication architecture (we are using a multi-master system called pgEdge; I don't think that makes a big difference, since it's based on logical replication), and I'm currently building a test cluster.

I've been instructed to issue VACUUM FULL nightly on both nodes, but before proceeding I read in the docs that VACUUM FULL can disrupt logical replication, so I'm a bit concerned about how to proceed. Rows are cleared one at a time (one transaction, one row, to confine any error to the record that caused it).

I've read about extensions like pg_squeeze, but I wonder whether they, too, might be dangerous for replication.
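
(For context, my understanding is that pg_squeeze rebuilds a table through logical decoding and only takes a short exclusive lock at the end, and that an ad-hoc rebuild looks roughly like the call below; I haven't verified the exact function signature for the version we would use, so treat it as a sketch.)

  -- ad-hoc rebuild of one bloated table with pg_squeeze; the parameter
  -- list may differ between extension versions, so this is only a sketch
  SELECT squeeze.squeeze_table('some_schema', 'tbl');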

Thanks for your help.
Moreno.-


