On 08/03/11 12:55, Mark Kirkwood wrote:
On 23/02/11 10:18, Mark Kirkwood wrote:
On 23/02/11 00:26, Greg Stark wrote:
It's also possible there's a bug, of course. If someone was using that
buffer and somehow failed to notify the vacuum that they were done, it
would wait for a very long time (forever?). However, if the vacuum
eventually continued when the query was canceled, then it seems likely
it was working as intended.
Greg, thanks for clarifying this.
Unfortunately, this time around I canceled the vacuum and then the
query. However, *next* time I'll get rid of the query first and see
what happens.
I have another one here:
postgres=# select datname,procpid, query_start, current_query
from pg_stat_activity
where now() - query_start > '2 hours'::interval
order by query_start;
datname | procpid | query_start | current_query
---------+---------+-------------------------------+-----------------
stuff | 23957 | 2011-03-08 06:02:02.659159+13 | VACUUM ANALYZE;
postgres=# select datname,procpid, query_start, current_query
from pg_stat_activity
where now() - backend_start > '2 hours'::interval
order by query_start;
datname | procpid | query_start | current_query
---------+---------+-------------------------------+-----------------
stuff | 23957 | 2011-03-08 06:02:02.659159+13 | VACUUM ANALYZE;
:
stuff | 30018 | 2011-03-08 11:57:03.214177+13 | <IDLE>
stuff | 11390 | 2011-03-08 12:27:20.390122+13 | <IDLE>
stuff | 10178 | 2011-03-08 12:28:59.328265+13 | <IDLE>
stuff | 5100 | 2011-03-08 12:30:10.318204+13 | <IDLE>
stuff | 10255 | 2011-03-08 12:32:21.520814+13 | <IDLE>
stuff | 29993 | 2011-03-08 12:32:26.562835+13 | <IDLE>
stuff | 10204 | 2011-03-08 12:40:57.059958+13 | <IDLE>
stuff | 10206 | 2011-03-08 12:41:11.708924+13 | <IDLE>
(9 rows)
It looks to me like whatever query was holding the buffer pins the
vacuum *was* waiting on should be well and truly finished by now. The
strace for the vacuum looks the same as before:
$ strace -p 23957
Process 23957 attached - interrupt to quit
semop(39747613, 0x7fff53d3dec0, 1
It certainly looks like the vacuum is, for some reason, unable to
resume after putting itself to sleep.
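A quick check I could have run here (just a sketch, untested): if the
backend shows waiting = false and has no ungranted entries in pg_locks,
then the semop above is presumably the buffer pin sleep rather than an
ordinary lock wait (as far as I know, buffer pin sleeps don't set the
waiting flag, since they aren't regular lock waits):
postgres=# select procpid, waiting, current_query
from pg_stat_activity
where procpid = 23957;
postgres=# select locktype, mode, granted
from pg_locks
where pid = 23957 and not granted;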
Immediately after I sent this, I saw the vacuum unblock itself and
continue... so there *are* some queries still blocking it. I'm seeing
some fairly high (1-2 minute) lock waits for a certain table:
postgres=# select datname,procpid, query_start, current_query
from pg_stat_activity where now() - query_start > '1 minute'
and procpid in (select distinct pid from pg_locks where not
granted) order by query_start;
datname | procpid | query_start | current_query
---------+---------+-------------------------------+------------------------------
stuff | 390 | 2011-03-08 12:57:01.752234+13 | UPDATE
content.twitter_queue
: SET error = $1
: WHERE
twitter_queue_id = $2
(30 or so similar rows)
which are probably not helping.
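Next time, something like this might help identify the blockers
directly (only a sketch, untested, and it only covers regular lock
waits, since buffer pins don't show up in pg_locks at all):
postgres=# select waiting.pid as waiting_pid,
       blocker.pid as blocking_pid,
       act.current_query as blocking_query
from pg_locks waiting                  -- the waiters
join pg_locks blocker                  -- holders of the same lock
  on blocker.locktype = waiting.locktype
 and blocker.database is not distinct from waiting.database
 and blocker.relation is not distinct from waiting.relation
 and blocker.page is not distinct from waiting.page
 and blocker.tuple is not distinct from waiting.tuple
 and blocker.virtualxid is not distinct from waiting.virtualxid
 and blocker.transactionid is not distinct from waiting.transactionid
 and blocker.classid is not distinct from waiting.classid
 and blocker.objid is not distinct from waiting.objid
 and blocker.objsubid is not distinct from waiting.objsubid
 and blocker.pid <> waiting.pid
join pg_stat_activity act on act.procpid = blocker.pid
where not waiting.granted
  and blocker.granted;
That should show who is actually holding the locks those UPDATEs (and
anything else) are queued behind.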
So, false alarm; no bug here, it would seem. Sorry for the noise.
Mark