Gregory Stark wrote:
If we could have autovacuum interrupt a vacuum in mid-sweep, perform a cycle
of vacuums on smaller tables, then resume, that problem would go away. That
sounds too difficult though, but perhaps we could do something nearly as good.

I think to make vacuum has this interrupted-resume capability is quite useful for large table.

It can provide more flexibility for autovacuum to create a good schedule scheme. Sometimes it takes a whole day to vacuum the large table (Hundreds-GB table may qualify); setting the cost_delay make it even lasts for several days. If the system has maintenance time, vacuum task of the large table can be split to fit into the maintenance time by interrupted-resume feature.

One option that I've heard before is to have vacuum after a single iteration
(ie, after it fills maintenance_work_mem and does the index cleanup and the
second heap pass), remember where it was and pick up from that point next
time.


Even a single iteration may take a long time, so it is not so much useful to have a break in the boundary of the iteration. I think it is not so difficult to get vacuum to remember where it leaves and start from where it leaves last time. The following is a basic idea.

A typical vacuum process mainly have the following phases:
 Phase 1. scan heap
 Phase 2. scan and sweep index
 Phase 3. sweep heap
 Phase 4. update FSM
 Phase 5. truncate CLOG

Where vacuum is interrupted, we can just save the collected information into the disk, and restore it later when vacuum restarts. When vacuum process is interrupted, we can remember the dead tuple list and the block number it has scanned in phase 1; the indexes it has cleanup in phase 2; the tuples it has swept in phase 3. Before exiting from vacuum, we can also merge the free space information into FSM.

We are working on this feature now. I will propose it latter to discuss with you.

Best Regards
Galy Lee
--
NTT Open Source Software Center

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

               http://www.postgresql.org/about/donate

Reply via email to