I need to reclaim the disk space from a heavily updated 7.3.4 table which 
has grown 95% bloated to 20 GB of disk (indices and toast included), and I 
need to reclaim it while the table continues to get updates, and without 
interrupting/delaying access for more than a few seconds.  This is an 
explanation of what I've tried so far, along with some observations and a 
request for other ideas to try.

Some constraints on my efforts:  Stopping the cluster for any reason is 
pretty undesirable, as is vacuum full.  Upgrading to 7.4.6 via slony is an 
option, but I'm curious if there is an easier way.

I assume this bloat is basically due to the garbage generation outpacing the 
garbage collection.  The bloat does not appear to be in the indices, so the 
index-bloat issue does not seem to be much in play here.  The table gets 
about 5% INSERTs, 95% UPDATEs, and 0% DELETEs, and the UPDATEs do not 
change indexed values, so it makes sense that the indices are not bloated.  
The UPDATEs to the table consist of appending text to a text column (UPDATE 
foo SET value = 'new text' || value).  I had max_fsm_pages at 4M (enough 
for up to 32 GB of disk?) and max_fsm_relations at 8K (8000 tables, etc.), 
both far more than I needed initially, but the table still bloated.  
[SIDEBAR:  Vacuum + FSM is not working as I expected; it is clearly not 
reclaiming the space resulting from the UPDATEs.  If I UPDATE 10000 rows 
and then run vacuum, I was expecting/hoping that a subsequent UPDATE of 
9000 rows would largely reuse the space reclaimed from the 10000-row 
UPDATE.]
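
As a sanity check on whether vacuum is keeping up, a per-table VACUUM 
VERBOSE run between update batches reports how many dead tuples were 
removed versus kept (the exact INFO wording varies by version); comparing 
those counts against the update rate shows whether reclaimed space ever 
makes it into the FSM fast enough to be reused:

        -- run between update batches; compare the removed/kept tuple
        -- counts in the INFO output against the rate of UPDATEs
        vacuum verbose analyze foo;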

Anyway, I need to reclaim the space without stopping the cluster and without 
blocking access to the table for more than a few seconds.  I seem to have 
found a way to do that for any table ('foo') that doesn't have fkey/func 
dependencies:

        create table foo_slim with the exact schema of foo
        begin
                alter table foo rename to foo_bloated
                create view foo as
                        select * from foo_slim union all select * from foo_bloated
                create rules:
                        on insert to foo:  insert into foo_slim
                        on delete to foo:  delete from foo_bloated or foo_slim
                        on update to foo:  if the row is in foo_bloated,
                                insert the new values into foo_slim
                                delete the old row from foo_bloated
        commit;
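
Spelled out as concrete SQL, that sketch might look like the following 
(untested, and assuming foo has a single-column primary key "id" plus the 
text column "value" -- adjust the column lists to the real schema):

        -- create the empty twin before taking over the name
        create table foo_slim as select * from foo limit 0;

        begin;
                alter table foo rename to foo_bloated;
                create view foo as
                        select * from foo_slim
                        union all
                        select * from foo_bloated;
                create rule foo_ins as on insert to foo do instead
                        insert into foo_slim (id, value)
                        values (new.id, new.value);
                create rule foo_del as on delete to foo do instead (
                        delete from foo_slim    where id = old.id;
                        delete from foo_bloated where id = old.id;
                );
                -- update = delete wherever the row lives, reinsert slim
                create rule foo_upd as on update to foo do instead (
                        delete from foo_bloated where id = old.id;
                        delete from foo_slim    where id = old.id;
                        insert into foo_slim (id, value)
                        values (new.id, new.value);
                );
        commit;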

Once that's done, the data begins to trickle into the new, slim table, and 
the transfer can be sped up by doing no-op updates on the foo view or a 
select-for-update function on the foo_bloated table.  Once all the data has 
migrated over to foo_slim, the initial state can be restored with:

        begin;
                drop view foo;
                alter table foo_slim rename to foo;
        commit;

and then foo_bloated can be dropped.  That seems to work well enough.  
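
For what it's worth, the no-op-update trick can be batched so it never 
holds locks for long; this is just a sketch, with "id" standing in for 
whatever the real key column is:

        -- migrate up to 1000 rows per pass; the on-update rule turns
        -- this into delete-from-foo_bloated + insert-into-foo_slim
        update foo set value = value
                where id in (select id from foo_bloated limit 1000);
        vacuum foo_bloated;  -- between batches, to recycle its pages
        -- repeat until "select count(*) from foo_bloated" reaches 0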

But how to do this on a table with foreign keys, triggers, etc.?  I was 
wondering if I could use the same basic approach but manually reconstitute 
the oid linkages so that the triggers and functions stayed intact across 
the renaming/dropping/etc., but that sounds a little dicey.  Any other 
ideas?

Thanks,
Ed

