On Tue, Feb 22, 2011 at 7:14 AM, Mark Kirkwood <mark.kirkw...@catalyst.net.nz> wrote: > On 22/02/11 19:47, Heikki Linnakangas wrote: >> >> A long query on the same table can block vacuum. Vacuum needs to take a >> so-called "cleanup lock" on each page, which means that it has to wait until >> no other backend holds a pin on the page. A long-running query can keep a >> page pinned for a long time. >> > > Ah, - this is LockBufferForCleanup? So we are waiting for an exclusive > LockBuffer operation i.e a LWLock not a (heavyweight) lock, urg... no > wonder.
Actually it's not waiting for the LockBuffer LWLock. it's waiting until your query unpins the buffer it wants. Vacuum tries to get an exclusive lock on the buffer, if it gets it then it checks if anyone is using that buffer. If someone is then it unlocks the buffer and waits until nobody has it pinned. Only certain plan types will keep a buffer pinned for a long time. Things like the outer table of a nested loop join where the inner side is another large table or slow subquery for example. This isn't terribly common, usually it's caused by a large mistaken cartesian join or something. but occasionally Postgres will generate a plan that could do it. For instance joining a single row against a large table will sometimes do a nested loop from the single row to the large table. It's also possible Vacuum has been making progress but the query keeps getting in its way and stopping it on new blocks. It's also possible there's a bug of course. If someone was using that buffer and somehow failed to notify the vacuum that they were done it would wait for a very long time (forever?). However if vacuum eventually continued when the query was canceled then it seems likely it was working as intended. -- greg -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs