On Mon, 27 Feb 2006, Tom Lane wrote:

> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> On Mon, 27 Feb 2006, Tom Lane wrote:
>>> This strikes me as a fairly bad idea, because it makes VACUUM dependent
>>> on correct functioning of user-written code --- consider a functional
>>> index involving a user-written function that was claimed to be immutable
>>> and is not.
>>
>> If the user-defined function is broken, you're in more or less trouble
>> anyway.
>
> Less.  A non-immutable function might result in lookup failures (not
> finding the row you need) but not in database corruption, which is what
> would ensue if VACUUM fails to remove an index tuple.  The index entry
> would eventually point to a wrong table entry, after the table item slot
> gets recycled for another tuple.

It's easy to detect when that happens: the row simply isn't found in the index. You can then complain loudly, or fall back to the full scan.

> Moreover, you haven't pointed to any strong reason to adopt this
> methodology.  It'd only be a win when vacuuming pretty small numbers
> of tuples, which is not the design center for VACUUM, and isn't likely
> to be the case in practice either if you're using autovacuum.  If you're
> removing say 1% of the tuples, you are likely to be hitting every index
> page to do it, meaning that the scan approach will be significantly
> *more* efficient than retail lookups.

It really depends on what kind of indexes the table has, their clustering order, and what kind of deletions and updates occur. Assuming a random distribution of dead tuples, a scan is indeed going to be more efficient, as you say.
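A back-of-envelope model makes the random case concrete. This is only an illustrative sketch, not PostgreSQL code; the index size, entries per leaf page, and dead-tuple fraction are all made-up assumptions:

```python
# Illustrative model (assumed numbers, not PostgreSQL internals):
# expected number of distinct leaf pages touched when removing index
# entries by retail lookup, with dead tuples uniformly distributed.
def pages_touched_retail(leaf_pages, tuples_per_page, dead_fraction):
    dead = int(dead_fraction * leaf_pages * tuples_per_page)
    # probability that a given leaf page receives at least one lookup
    return leaf_pages * (1 - (1 - 1 / leaf_pages) ** dead)

leaf_pages = 10_000       # assumed index size in leaf pages
tuples_per_page = 100     # assumed index entries per leaf page
retail = pages_touched_retail(leaf_pages, tuples_per_page, 0.01)
print(f"retail lookups touch ~{retail:.0f} of {leaf_pages} leaf pages")
```

Even at only 1% dead tuples, roughly 63% of the leaf pages get touched, and as random reads rather than one sequential pass, which is the point above.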

Assuming a tightly packed run of dead tuples, however, say after a "DELETE FROM log WHERE timestamp < now() - interval '1 month'", you would benefit from a partial vacuum.
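Under the same illustrative model, the clustered case looks very different: if the dead entries are contiguous in index order (say, an index on the timestamp column after that kind of range delete), they occupy a short run of leaf pages. The numbers below are again assumptions for illustration:

```python
import math

# Illustrative model (assumed numbers): dead index entries are
# contiguous in index order, e.g. a timestamp index after a bulk
# range delete, so they occupy a short run of leaf pages.
def pages_touched_clustered(dead_tuples, tuples_per_page):
    return math.ceil(dead_tuples / tuples_per_page)

dead_tuples = 10_000      # assumed: same 1% of a 1M-entry index
tuples_per_page = 100     # assumed index entries per leaf page
touched = pages_touched_clustered(dead_tuples, tuples_per_page)
print(f"retail removal touches ~{touched} of 10000 leaf pages")
```

The same 1% of dead tuples now costs about 100 page reads instead of a pass over all 10,000, which is where a partial vacuum would win.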

That's certainly something that needs to be tested. It's important to implement this so that it falls back to the full scan whenever the scan would be significantly faster.

- Heikki
