On 08/03/11 13:03, Mark Kirkwood wrote:
On 08/03/11 12:55, Mark Kirkwood wrote:
On 23/02/11 10:18, Mark Kirkwood wrote:
On 23/02/11 00:26, Greg Stark wrote:
It's also possible there's a bug of course. If someone was using that
buffer and somehow failed to notify the vacuum that they were done it
would wait for a very long time (forever?). However if vacuum
eventually continued when the query was canceled then it seems likely
it was working as intended.
Greg, thanks for clarifying this.
Unfortunately this time around I canceled the vacuum and then the
query. However, *next* time I'll get rid of the query first and see
what happens.
I have another one here:
postgres=# select datname,procpid, query_start, current_query
from pg_stat_activity
where now() - query_start > '2 hours'::interval
order by query_start;
datname | procpid | query_start | current_query
---------+---------+-------------------------------+-----------------
stuff | 23957 | 2011-03-08 06:02:02.659159+13 | VACUUM ANALYZE;
postgres=# select datname,procpid, query_start, current_query
from pg_stat_activity
where now() - backend_start > '2 hours'::interval
order by query_start;
datname | procpid | query_start | current_query
---------+---------+-------------------------------+-----------------
stuff | 23957 | 2011-03-08 06:02:02.659159+13 | VACUUM ANALYZE;
:
stuff | 30018 | 2011-03-08 11:57:03.214177+13 | <IDLE>
stuff | 11390 | 2011-03-08 12:27:20.390122+13 | <IDLE>
stuff | 10178 | 2011-03-08 12:28:59.328265+13 | <IDLE>
stuff | 5100 | 2011-03-08 12:30:10.318204+13 | <IDLE>
stuff | 10255 | 2011-03-08 12:32:21.520814+13 | <IDLE>
stuff | 29993 | 2011-03-08 12:32:26.562835+13 | <IDLE>
stuff | 10204 | 2011-03-08 12:40:57.059958+13 | <IDLE>
stuff | 10206 | 2011-03-08 12:41:11.708924+13 | <IDLE>
(9 rows)
It looks to me like whichever queries held the buffer pins the vacuum
*was* waiting on should be well and truly finished by now. The strace
for the vacuum looks the same as before:
$ strace -p 23957
Process 23957 attached - interrupt to quit
semop(39747613, 0x7fff53d3dec0, 1
It certainly looks like the vacuum is not able to resume after
setting itself to sleep for some reason.
Immediately after I sent this, I saw the vacuum unlock itself and
continue... so there *are* some queries still blocking it. I'm seeing
some highish (1-2 minute) lock waits for a certain table:
postgres=# select datname,procpid, query_start, current_query
from pg_stat_activity where now() - query_start > '1 minute'
and procpid in (select distinct pid from pg_locks where not
granted) order by query_start;
datname | procpid | query_start | current_query
---------+---------+-------------------------------+------------------------------
stuff    |     390 | 2011-03-08 12:57:01.752234+13 | UPDATE content.twitter_queue
         |         |                               :   SET error = $1
         |         |                               :   WHERE twitter_queue_id = $2
(30 or so)
which are probably not helping.
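For the record, a rough self-join on pg_locks (along the lines of the
lock-monitoring queries on the PostgreSQL wiki) shows which backend each
of those waiters is queued behind - a sketch only, matching lock requests
on the same target:

-- Pair each ungranted lock request with a granted lock on the same
-- target (relation, tuple, or transactionid).
SELECT w.pid AS waiting_pid,
       h.pid AS holding_pid,
       w.locktype,
       w.relation::regclass AS relation,
       h.mode AS held_mode
FROM pg_locks w
JOIN pg_locks h
  ON  h.locktype      IS NOT DISTINCT FROM w.locktype
  AND h.database      IS NOT DISTINCT FROM w.database
  AND h.relation      IS NOT DISTINCT FROM w.relation
  AND h.page          IS NOT DISTINCT FROM w.page
  AND h.tuple         IS NOT DISTINCT FROM w.tuple
  AND h.transactionid IS NOT DISTINCT FROM w.transactionid
  AND h.pid <> w.pid
WHERE NOT w.granted
  AND h.granted;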
Make that definitely not helping:
stuff=# select relname, last_vacuum
from pg_stat_all_tables
where last_vacuum is not null
and last_vacuum > '2011-03-08 06:03:06'::timestamp
order by last_vacuum limit 3;
relname | last_vacuum
----------------+-------------------------------
headline_alert | 2011-03-08 06:03:06.182866+13
stock_static | 2011-03-08 06:03:06.610739+13
twitter_queue | 2011-03-08 12:58:03.02451+13
(3 rows)
So we were trying to vacuum 'twitter_queue' (a small 20MB table) for
approx 6 hrs.
Note that the vacuum itself was not waiting on any locks according to
pg_locks, so this looks to be the pinning issue. Clearly we need to amend
the application design that results in this quite long (1-2 minute!) wait
for the lock on this table.
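(For the record, that was checked with something along these lines - a
minimal sketch, using the vacuum's pid from the pg_stat_activity output
above:)

-- Any ungranted heavyweight-lock requests for the vacuum backend (23957)?
-- No rows means the wait is not a lock visible in pg_locks - in this case
-- it was the buffer-pin sleep.
SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE pid = 23957
  AND NOT granted;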
regards
Mark
P.S: for the interested, the process(es) that the above UPDATEs are
waiting for are sitting '<IDLE> in transaction' - i.e. the application end
is doing some *non-database* blocking call whilst holding a lock on the
'twitter_queue' table :-/
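(A minimal sketch of how to spot those sessions, assuming the pre-9.2
pg_stat_activity column names used above, run in the 'stuff' database:)

-- Backends sitting '<IDLE> in transaction' while still holding a lock
-- on content.twitter_queue.
SELECT a.procpid, a.query_start, l.mode
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.procpid AND l.granted
WHERE a.current_query = '<IDLE> in transaction'
  AND l.relation = 'content.twitter_queue'::regclass;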