On 14.07.2011 18:57, Pavan Deolasee wrote:
> On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> I'd say that seems way too complex for such a small use case and we've
>> only just fixed the bugs from 8.4 vacuum map complexity. The code's
>> looking very robust now and I'm uneasy that such changes are really
>> worth it.
>
> Thanks Simon for looking at the patch.
>
> I am not sure if the use case is really narrow. Today, we dirty the pages in
> both passes and also emit WAL records. Just the heap scan can take a
> very long time for large tables, blocking the autovacuum workers from
> doing useful work on other tables. If I am not wrong, we use ring buffers
> for vacuum, which would most likely force those buffers to be written to
> and read from disk twice.

Seems worthwhile to me. What bothers me a bit is the need for the new 64-bit LSN value on each heap page. Also, note that temporary tables are not WAL-logged, so there are no LSNs.

How does this interact with the visibility map? If you set the visibility map bit after vacuuming indexes, a subsequent vacuum will not visit the page. The second vacuum will update relindxvacxlogid/off, but it will not clean up the dead line pointers left behind by the first vacuum. Now the LSN on the page differs from the one stored in pg_class, so subsequent pruning will not remove the dead line pointers either. I think you can sidestep that if you check that the page's vacuum LSN <= vacuum LSN in pg_class, instead of checking for equality.
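
To make the difference between the two checks concrete, here is a toy Python model. It is not PostgreSQL code: the `HeapPage` class, the function names and the LSN values are all made up for illustration, and LSNs are modelled as plain integers.

```python
from dataclasses import dataclass


@dataclass
class HeapPage:
    # Hypothetical per-page field: LSN stamped by the vacuum that left
    # dead line pointers behind on this page.
    vacuum_lsn: int
    dead_line_pointers: int


def can_reclaim_equality(page: HeapPage, rel_index_vac_lsn: int) -> bool:
    """Check that requires the page stamp to equal the pg_class value."""
    return page.vacuum_lsn == rel_index_vac_lsn


def can_reclaim_le(page: HeapPage, rel_index_vac_lsn: int) -> bool:
    """Suggested check: page stamp <= the pg_class value."""
    return page.vacuum_lsn <= rel_index_vac_lsn


# Vacuum 1 stamps the page with LSN 100 and finishes its index pass,
# so pg_class also holds 100. The visibility map bit is then set.
page = HeapPage(vacuum_lsn=100, dead_line_pointers=3)
rel_lsn = 100

# Vacuum 2 skips the page because of the visibility map, but still
# advances the value in pg_class.
rel_lsn = 200

stuck_with_equality = not can_reclaim_equality(page, rel_lsn)
reclaimable_with_le = can_reclaim_le(page, rel_lsn)

# A page stamped by a vacuum that has not finished its index pass yet
# carries a newer LSN than pg_class and must not be reclaimed.
in_progress_page = HeapPage(vacuum_lsn=300, dead_line_pointers=1)
in_progress_protected = not can_reclaim_le(in_progress_page, rel_lsn)
```

In this model the equality check leaves the skipped page's line pointers stranded, while `<=` still allows pruning and keeps pages from an unfinished vacuum protected.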

Ignoring the issue stated in the previous paragraph, I think you wouldn't actually need a 64-bit LSN. A smaller counter is enough, as wrap-around doesn't matter. In fact, a single bit would be enough. After a successful vacuum, the counter on each heap page (with dead line pointers) is N, and the value in pg_class is N. There are no other values on the heap, because vacuum will have cleaned them up. When you begin the next vacuum, it will stamp pages with N+1. So at any stage, there is only one of two values on any page, so a single bit is enough. (But as I said, that doesn't hold if vacuum skips some pages thanks to the visibility map.)
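
The single-bit argument can be checked with a small Python sketch. Again this is only a toy model with made-up names: stamps are integers, `reclaimable` is a hypothetical equality check on the stamp truncated to a given number of bits, and each vacuum is assumed to bump the pg_class value by one.

```python
def reclaimable(page_stamp: int, rel_stamp: int, bits: int) -> bool:
    """Equality check on stamps truncated to `bits` bits."""
    mask = (1 << bits) - 1
    return (page_stamp & mask) == (rel_stamp & mask)


def one_bit_agrees_when_all_pages_visited(rounds: int = 6) -> bool:
    """Compare the 1-bit and 64-bit checks when no page is ever skipped."""
    rel = 0  # value N in pg_class after the last successful vacuum
    for _ in range(rounds):
        # While the next vacuum runs, a page holds either N (not yet
        # revisited) or N+1 (freshly stamped). No other value exists.
        for page_stamp in (rel, rel + 1):
            if reclaimable(page_stamp, rel, 64) != reclaimable(page_stamp, rel, 1):
                return False
        rel += 1  # the vacuum finished and advanced pg_class
    return True


full_scan_ok = one_bit_agrees_when_all_pages_visited()

# A page stamped with 5 that two later vacuums skipped (pg_class is now 7):
# the truncated check and the full check no longer give the same answer.
one_bit_answer = reclaimable(5, 7, bits=1)
full_answer = reclaimable(5, 7, bits=64)
```

With every page visited, the one-bit check matches the full counter in every round. Once a page is skipped, a third value is present on the heap and the two checks diverge, which is the caveat in the parenthesis above.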

Is there something in place to make sure that pruning uses an up-to-date relindxvacxlogid/off value? I guess it doesn't matter if it's out of date; you'll just miss the opportunity to remove some dead tuples.

Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in one uint64 column, or invent a new datatype for LSNs, or store it as text in %X/%X format.
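
As a rough illustration of the single-column alternatives, the following Python sketch packs the two 32-bit halves into one 64-bit integer and converts to and from the %X/%X text form. The function names are hypothetical and the sample value is arbitrary. Note that an offset of 2^31 or above, such as the one used here, would not fit in a signed int32 column without showing up as a negative number.

```python
def pack_lsn(xlogid: int, xrecoff: int) -> int:
    """Combine the two unsigned 32-bit halves into one 64-bit value."""
    if not (0 <= xlogid <= 0xFFFFFFFF and 0 <= xrecoff <= 0xFFFFFFFF):
        raise ValueError("both halves must fit in 32 unsigned bits")
    return (xlogid << 32) | xrecoff


def unpack_lsn(lsn: int) -> tuple[int, int]:
    """Split a 64-bit value back into (xlogid, xrecoff)."""
    return lsn >> 32, lsn & 0xFFFFFFFF


def format_lsn(lsn: int) -> str:
    """Render a 64-bit value in %X/%X form."""
    xlogid, xrecoff = unpack_lsn(lsn)
    return "%X/%X" % (xlogid, xrecoff)


def parse_lsn(text: str) -> int:
    """Parse %X/%X text back into a 64-bit value."""
    high, low = text.split("/")
    return pack_lsn(int(high, 16), int(low, 16))


lsn = pack_lsn(0x16, 0xB374D848)
as_text = format_lsn(lsn)
round_trip = parse_lsn(as_text)
```

A single integer keeps the `<=` comparison a plain numeric comparison, whereas the text form is easier to read but has to be parsed before two values can be compared.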

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

