On Mar 3, 2010, at 18:01 , Josh Kupershmidt wrote:

> Though next time you see a query which doesn't respond to 
> pg_cancel_backend(), try gathering information about the query and what the 
> backend is doing; either you're doing something unusual (e.g. an app is 
> restarting the query automatically after getting canceled) or perhaps you've 
> stumbled on a bug in Postgres.

Hi. A long time has passed since you made that suggestion, but today we 
stumbled again on a query that wouldn't be canceled. Not only does it not 
respond to pg_cancel_backend(), it also doesn't respond to kill -SIGTERM.

The query is:

select 
date_trunc('day',rb.time_stamp),count(*),rb.category,channels.channel_id,channels.name
 as channel,platforms.platform_id,
platforms.name,rb.operator,item,delivered,msisdn

from public.rb__view as rb,channels,platforms
where
rb.channel_id=channels.channel_id
and rb.platform_id=platforms.platform_id

and rb.time_stamp>='2010-03-14'::date  and rb.time_stamp<'2010-03-14'::date  + 
interval '1 day'
and platforms.platform_id=262
and channels.channel_id=1

group by 
date_trunc('day',rb.time_stamp),rb.category,channels.channel_id,channel,operator,item,delivered,msisdn,platforms.platform_id,
platforms.name

This is nothing too fancy - just an aggregate with group by.  And the 
application on the other side is Crystal Reports, connecting using ODBC. I 
don't believe the application does anything like restart after cancel, because 
most of our queries can easily be cancelled and I don't think Crystal has 
different behaviors for different queries.

rb__view is a union all between two tables (rb and rb__archive) which have the 
same schema - one holds data from the past 7 weeks and the other holds older 
data. The channels and platforms tables are basically lookup tables. The fields 
item,delivered and msisdn all belong to rb__view.

There is nothing in the PostgreSQL log.

If it helps any, this is the EXPLAIN output for the above query. Note that at 
this time, the query has been running for over a hour and a half.

HashAggregate  (cost=221312.77..221318.08 rows=354 width=94)
  ->  Nested Loop  (cost=8078.83..221215.50 rows=3537 width=94)
        ->  Seq Scan on channels  (cost=0.00..3.81 rows=1 width=16)
              Filter: (channel_id = 1::numeric)
        ->  Nested Loop  (cost=8078.83..221167.48 rows=3537 width=85)
              ->  Index Scan using platforms_pkey on platforms  
(cost=0.00..6.27 rows=1 width=19)
                    Index Cond: (platform_id = 262::numeric)
              ->  Append  (cost=8078.83..221125.84 rows=3537 width=73)
                    ->  Bitmap Heap Scan on rb  (cost=8078.83..221115.42 
rows=3536 width=72)
                          Recheck Cond: ((public.rb.time_stamp >= 
'2010-03-14'::date) AND (public.rb.time_stamp < '2010-03-15 
00:00:00'::timestamp without time zone))
                          Filter: ((public.rb.channel_id = 1::numeric) AND 
(public.rb.platform_id = 262::numeric))
                          ->  Bitmap Index Scan on rb_timestamp_ind  
(cost=0.00..8077.94 rows=104502 width=0)
                                Index Cond: ((public.rb.time_stamp >= 
'2010-03-14'::date) AND (public.rb.time_stamp < '2010-03-15 
00:00:00'::timestamp without time zone))
                    ->  Index Scan using rba_timestamp_ind on rb__archive  
(cost=0.00..10.42 rows=1 width=73)
                          Index Cond: ((rb__archive.time_stamp >= 
'2010-03-14'::date) AND (rb__archive.time_stamp < '2010-03-15 
00:00:00'::timestamp without time zone))
                          Filter: ((rb__archive.channel_id = 1::numeric) AND 
(rb__archive.platform_id = 262::numeric))

I don't know what other information I may glean for this. Any thoughts?

Thank you,
Herouth

Reply via email to