The following bug has been logged online: Bug reference: 4730 Logged by: Wayne Conrad Email address: wcon...@yagni.com PostgreSQL version: 8.3.6 Operating system: Debian 4.3.1-1 Description: Vacuum full verbose analyze "deadlock" Details:
"VACUUM FULL ANALYZE VERBOSE" on a "deadlocks" with other SQL statements. No deadlock detection messages appear in the Postgres log, which is why I have put "deadlock" in quotes: It's a deadlock in the computer science meaning of the word, but perhaps not in the Postgres internal meaning of the word. Specifically, the VACUUM appears to be waiting on SQL statements which are in turn waiting on the VACUUM. That condition continues indefinitely until the VACUUM is manually terminated (a control-C in psql), at which time all locks appear to be released and processing continues normally. A query of pg_locks shows the condition. We've written a program that queries pg_locks and shows who is waiting on whom. here is a snippet of its output for one of these incidents. If a raw query of that table (or some other) would be better, please let me know what that query should be. ---------- Lock: locktype=relation relation=jobs granted=false transaction=4/900 pid=10247 mode=AccessExclusiveLock query = vacuum full verbose analyze; Dependants of that lock: locktype=relation relation=jobs granted=false transaction=22/654227 pid=3321 mode=AccessShareLock query = select id, to_char(time_stamp, 'YYYY-MM-DD HH24:MI:SS') as time_stamp, injectd_log.log_pid as log_pid, injectd_host, injectd_pid, line from injectd_log where log_pid in (select jobs.pid from jobs inner join uploads using (upload_id) where account_id = E'1000-000' union (select pid from uploads where account_id = E'1000-000')) and id > 27398675 order by id locktype=relation relation=jobs granted=false transaction=15/1241550 pid=32466 mode=AccessShareLock query = select w.id as id, w.created_at as created_at, w.status as status, w.timer as timer, w.error as error, w.priority as priority, w.work_type_id as work_type_id, w.upload_id as upload_id, w.job_id as job_id, w.batch_command_id as batch_command_id, w.mail_id as mail_id, w.print_queue_id as print_queue_id from work_queue_available inner join work_queue_data w on (work_queue_available.id = w.id) left outer join batch_commands on (batch_command_id = batch_commands.id) left outer join mail on (mail_id = mail.id) left outer join print_queue on (print_queue_id = print_queue.id) left outer join uploads on (w.upload_id = uploads.upload_id) left outer join jobs on (w.job_id = jobs.job_id) order by w.priority desc, case when w.work_type_id = (select id from work_types where name = E'cron') then 1 when w.work_type_id = (select id from work_types where name = E'email') then 2 else 3 end, coalesce(batch_commands.created_at, mail.created_at, print_queue.created_at, uploads.log_time, jobs.start_time), id limit 28 locktype=relation relation=jobs granted=true transaction=7/1931437 pid=27584 mode=AccessShareLock query = <IDLE> in transaction locktype=relation relation=jobs granted=true transaction=8/2113080 pid=31961 mode=AccessShareLock query = <IDLE> in transaction ---------- Lock: locktype=relation relation=jobs granted=false transaction=22/654227 pid=3321 mode=AccessShareLock query = select id, to_char(time_stamp, 'YYYY-MM-DD HH24:MI:SS') as time_stamp, injectd_log.log_pid as log_pid, injectd_host, injectd_pid, line from injectd_log where log_pid in (select jobs.pid from jobs inner join uploads using (upload_id) where account_id = E'1000-000' union (select pid from uploads where account_id = E'1000-000')) and id > 27398675 order by id Dependants of that lock: locktype=relation relation=jobs granted=false transaction=4/900 pid=10247 mode=AccessExclusiveLock query = vacuum full verbose analyze; locktype=relation relation=jobs granted=false transaction=15/1241550 pid=32466 mode=AccessShareLock query = select w.id as id, w.created_at as created_at, w.status as status, w.timer as timer, w.error as error, w.priority as priority, w.work_type_id as work_type_id, w.upload_id as upload_id, w.job_id as job_id, w.batch_command_id as batch_command_id, w.mail_id as mail_id, w.print_queue_id as print_queue_id from work_queue_available inner join work_queue_data w on (work_queue_available.id = w.id) left outer join batch_commands on (batch_command_id = batch_commands.id) left outer join mail on (mail_id = mail.id) left outer join print_queue on (print_queue_id = print_queue.id) left outer join uploads on (w.upload_id = uploads.upload_id) left outer join jobs on (w.job_id = jobs.job_id) order by w.priority desc, case when w.work_type_id = (select id from work_types where name = E'cron') then 1 when w.work_type_id = (select id from work_types where name = E'email') then 2 else 3 end, coalesce(batch_commands.created_at, mail.created_at, print_queue.created_at, uploads.log_time, jobs.start_time), id limit 28 locktype=relation relation=jobs granted=true transaction=7/1931437 pid=27584 mode=AccessShareLock query = <IDLE> in transaction locktype=relation relation=jobs granted=true transaction=8/2113080 pid=31961 mode=AccessShareLock query = <IDLE> in transaction ---------- Lock: locktype=relation relation=jobs granted=false transaction=15/1241550 pid=32466 mode=AccessShareLock query = select w.id as id, w.created_at as created_at, w.status as status, w.timer as timer, w.error as error, w.priority as priority, w.work_type_id as work_type_id, w.upload_id as upload_id, w.job_id as job_id, w.batch_command_id as batch_command_id, w.mail_id as mail_id, w.print_queue_id as print_queue_id from work_queue_available inner join work_queue_data w on (work_queue_available.id = w.id) left outer join batch_commands on (batch_command_id = batch_commands.id) left outer join mail on (mail_id = mail.id) left outer join print_queue on (print_queue_id = print_queue.id) left outer join uploads on (w.upload_id = uploads.upload_id) left outer join jobs on (w.job_id = jobs.job_id) order by w.priority desc, case when w.work_type_id = (select id from work_types where name = E'cron') then 1 when w.work_type_id = (select id from work_types where name = E'email') then 2 else 3 end, coalesce(batch_commands.created_at, mail.created_at, print_queue.created_at, uploads.log_time, jobs.start_time), id limit 28 Dependants of that lock: locktype=relation relation=jobs granted=false transaction=4/900 pid=10247 mode=AccessExclusiveLock query = vacuum full verbose analyze; locktype=relation relation=jobs granted=false transaction=22/654227 pid=3321 mode=AccessShareLock query = select id, to_char(time_stamp, 'YYYY-MM-DD HH24:MI:SS') as time_stamp, injectd_log.log_pid as log_pid, injectd_host, injectd_pid, line from injectd_log where log_pid in (select jobs.pid from jobs inner join uploads using (upload_id) where account_id = E'1000-000' union (select pid from uploads where account_id = E'1000-000')) and id > 27398675 order by id locktype=relation relation=jobs granted=true transaction=7/1931437 pid=27584 mode=AccessShareLock query = <IDLE> in transaction locktype=relation relation=jobs granted=true transaction=8/2113080 pid=31961 mode=AccessShareLock query = <IDLE> in transaction To summarize the above output: there are three statements, each waiting on the other two: transaction 4/900: VACUUM FULL VERBOSE ANALYZE transaction 22/654277: SELECT ... FROM INJECTD_LOG ... transaction 15/1241550: SELECT ... FROM WORK_QUEUE_AVAILABLE ... A "VACUUM VERBOSE ANALYZE" (without the "FULL") does not manifest this anomaly. Furthermore, after doing a partial "VACUUM VERBOSE ANALYZE" (that is, aborting it after it has gotten finished with some tables, but before it has done all of them), a "VACUUM FULL VERBOSE ANALYZE" gets farther (though more tables) before the lockup occurs. Specifically, the "VACUUM FULL VERBOSE ANALYZE" does not get locked up until it hits a table that the "VACUUM VERBOSE ANALYZE" did not get to. Sadly, I have only made this happen in our production environment (very active, very large set of data). No tidy test case wrapped up in a tarball with ribbons and bows. I am submitting this bug report despite the clear instructions to only submit bug reports against the latest minor version, since I'm currently at the end of my rope (until I manage to reproduce this in a non-production environment). I am hoping I can be forgiven for being one minor version behind. If not, let me know--I've got a nice corner right here to go stand in. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs