Josh Berkus wrote:
I've been trying to diagnose in a production database why certain tables
never get autovacuumed despite having a substantial % of updates.  The
obvious reason is locks blocking autovacuum from vacuuming the table ...

Missed this dicussion when it popped up but have plenty to say about it now. What I do here is look for such anomolies using pg_stat_user_tables, that the dead rows number has exceeded the vacuum threshold. That comparison is logged in the code at DEBUG3:

elog(DEBUG3, "%s: vac: %.0f (threshold %.0f), anl: %.0f (threshold %.0f)",
            NameStr(classForm->relname),
            vactuples, vacthresh, anltuples, anlthresh);

But a rough computation isn't too hard to derive in a report, so long as you haven't customized per-table parameters. As you suggested in your message, the really bad cases here stick out a whole lot. If you pay the slightest amount of attention to the dead row percentages they jump right out at you. This all works easily on any version back to 8.3. Not having as much relevant data stored in pg_stat_user_tables makes the problem cases less obvious to spot in older versions.

If I start seeing these badly maintained tables and suspect locking is getting in the way, I then dump traces from pg_locks+pg_stat_activity often enough that I can estimate how often someone has an interfering lock and what they're doing.

Should the log level on this message go up from DEBUG3? I could see rewriting it so that it logs at DEBUG1 instead when Log_autovacuum_min_duration is set *and* when the trigger threshold is crossed, and at DEBUG3 the rest of the time. Given you can derive this with a bit of work in userland, I don't see this even being justified as an INFO or LOG level message. Anytime I can script a SQL-level monitor for something that's easy to tie into Nagios or something, I greatly prefer that to log file scraping for it anyway.

What I'd like to do is add some logging code to autovacuum.c so that if
log_autovacuum is any value other than -1, failure to vacuum due to
locks gets logged.   Does this make sense?

The general idea is interesting and probably more productive for the situation you theorize is happening then messing with the logging discussed above. But that's not where the code needs to go--the lock isn't opened until much further down the function call stack. Attached quickie and only tested for compilation patch probably does what you want here. Since this would eliminate the messy follow-up step I sometimes have gone through, dumping pg_locks data to confirm or rule out locking issues messing with AV processing, I can see some potential that it may have simplified situations I've ran into in the past. And it's not out of line with the logging verbosity of similar failure mode tests that follow it. Right now failure to acquire a lock is just not considered a log-worthy issue, and I agree that it's worth considering whether it should be. If you could gather more info on whether this logging catches the problem cases you're seeing, that would really be the right test for the patch's usefulness. I'd give you solid 50/50 odds that you've correctly diagnosed the issue, and knowing for sure would make advocating for this logging a pretty easy sell to me at least.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 2f68df4..2124e25 100644
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*************** vacuum_rel(Oid relid, VacuumStmt *vacstm
*** 851,856 ****
--- 851,864 ----
  	{
  		PopActiveSnapshot();
  		CommitTransactionCommand();
+ 
+ 		if (IsAutoVacuumWorkerProcess() && Log_autovacuum_min_duration >= 0)
+ 		{
+ 		ereport(INFO,
+ 			(errmsg("skipping \"%s\" --- cannot open or obtain lock",
+ 					RelationGetRelationName(onerel))));
+ 		}
+ 
  		return;
  	}
  
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to