On 09/19/2012 01:48 PM, l...@tom.com wrote:
The following bug has been logged on the website:

Bug reference:      7556
Logged by:          lt
Email address:      l...@tom.com
PostgreSQL version: 9.2.0
Operating system:   windows xp
Description:


create table sli_test (id int primary key,info varchar(20));
insert into sli_test select
generate_series(1,1000000),'digoal'||generate_series(1,1000000);
analyze verbose sli_test;
create table sli_test2 (id int not null,info varchar(20));
insert into sli_test2 select
generate_series(1,1000000),'dbase'||generate_series(1,1000000);
analyze verbose sli_test2;

explain select max(a.info)from sli_test a where a.id not in(select
b.id from sli_test2 b where b.id<50000);

                                       QUERY PLAN
---------------------------------------------------------------------------------------
  Aggregate  (cost=9241443774.00..9241443774.01 rows=1 width=12)

Here's what I get on 9.1:

regress=# explain select max(a.info)from sli_test a where a.id not in(select
regress(# b.id from sli_test2 b where b.id<50000);
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=38050.82..38050.83 rows=1 width=12)
   ->  Seq Scan on sli_test a  (cost=18026.82..36800.82 rows=500000 width=12)
         Filter: (NOT (hashed SubPlan 1))
         SubPlan 1
           ->  Seq Scan on sli_test2 b  (cost=0.00..17906.00 rows=48329 width=4)
                 Filter: (id < 50000)
(6 rows)


It runs in about 500ms here.

You don't appear to have posted the full query plan, so it's hard to compare.
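
For comparison, the complete plan can be captured with plain `EXPLAIN`, which only plans the query and does not execute it. The following is a sketch; reporting the server version and `work_mem` alongside the plan is a suggestion added here, since `work_mem` is one of the settings that can influence whether a hashed SubPlan is chosen:

```sql
-- Plain EXPLAIN plans the query without running it, so it is safe
-- to use even when the estimated cost is very high.
EXPLAIN
SELECT max(a.info)
FROM sli_test a
WHERE a.id NOT IN (SELECT b.id FROM sli_test2 b WHERE b.id < 50000);

-- Server version and memory setting: useful context when comparing plans.
SELECT version();
SHOW work_mem;
```

Posting every line of the `EXPLAIN` output, not just the top `Aggregate` node, makes it possible to compare the two plans node by node.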

In general, `NOT IN` is a poor way to formulate a query like this; you're usually better off with an anti-join (`LEFT JOIN ... WHERE ... IS NULL`) or with `NOT EXISTS`. See e.g.:

http://stackoverflow.com/questions/12444142/postgresql-how-to-figure-out-missing-numbers-in-a-column-using-generate-series/12444165#12444165
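
Applied to the query from the report, the two alternatives might look like the sketch below. It assumes the table definitions quoted above, where `sli_test.id` is a primary key and `sli_test2.id` is declared `NOT NULL`; with no NULLs on either side, both forms return the same result as the `NOT IN` version. If either column could be NULL, `NOT IN` and `NOT EXISTS` would no longer be interchangeable.

```sql
-- NOT EXISTS form: keep rows of sli_test that have no matching id
-- among the sli_test2 rows with id < 50000.
SELECT max(a.info)
FROM sli_test a
WHERE NOT EXISTS (
    SELECT 1
    FROM sli_test2 b
    WHERE b.id = a.id
      AND b.id < 50000
);

-- Anti-join form: LEFT JOIN, then keep only the unmatched rows.
-- b.id is NOT NULL in the table, so "b.id IS NULL" after the join
-- can only mean that no row of sli_test2 matched.
SELECT max(a.info)
FROM sli_test a
LEFT JOIN sli_test2 b
       ON b.id = a.id
      AND b.id < 50000
WHERE b.id IS NULL;
```

Running `EXPLAIN` on each form shows which join strategy the planner picks on a given version.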

--
Craig Ringer


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
