> -----Original Message----- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Steve Crawford > Sent: Monday, July 26, 2004 1:23 PM > To: [EMAIL PROTECTED] > Subject: [GENERAL] vacuumdb hanging database cluster > > > When I run: > vacuumdb --full --all --analyze --quiet > on my database cluster it will complete in < 2 minutes (this cluster > is a few million total rows and ~2GB). > > After testing, I set this up as an off-hours cron job and it worked > fine for several days then hung the whole database. After my pager > pulled me from bed I found the vacuumdb process still running but the > vacuum process on the first database (alphabetically) was showing it > was waiting: > postgres: vacuumdb --full --all --analyze --quiet > postgres: postgres firstdb [local] VACUUM waiting > > A couple hundred processes were showing as "startup waiting" and one > was "idle in transaction". The process in the "VACUUM waiting" state > was the only one connected to that database - all other connections > were to other databases. > > CPU and disk utilization were essentially zero. Suspecting a lock > problem I attempted to use a pre-existing connection to view pg_locks > but it would not respond. > > I killed the vacuum process and all the processes in the "waiting" > states cleared within a second or two and system returned to normal. > The pg_locks query also returned but showed no useful info. > > I tracked down the process that was "idle in transaction" and > it was a > pg_dump process running on another machine. This process does a > periodic dump of one very small table and should complete in a > fraction of a second but was still waiting since the previous day - > apparently without deleterious effects. > > There was no useful info in the log. > > I've stopped running the vacuum full job via cron till I can > trust it. > Any ideas on how to track/prevent this behavior? Server is version > 7.4.1 and my web searches have proved futile.
I have seen problematic behavior when one vacuum starts after another is already running. It might be a good idea to semaphore vacuum operations. But my experience is with an older version of PostgreSQL, so the problems you are seeing might be totally unrelated. ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster
