On Sat, 2 Mar 2019 at 12:13, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> "Li, Zheng" <zhe...@amazon.com> writes:
> > Although adding "or var is NULL" to the anti join condition forces the 
> > planner to choose nested loop anti join, it is always faster compared to 
> > the original plan.
>
> TBH, I am *really* skeptical of sweeping claims like that.  The existing
> code will typically produce a hashed-subplan plan, which ought not be
> that awful as long as the subquery result doesn't blow out memory.
> It certainly is going to beat a naive nested loop.

It's pretty easy to show the "always faster" claim is false using master
and NOT EXISTS.

create table small (a int not null);
create table big (a int not null);
insert into small select generate_series(1,1000);
insert into big select x%1000+1 from generate_series(1,1000000) x;

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a);
Time: 178.575 ms

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a or s.a is null);
Time: 38049.969 ms (00:38.050)
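
That's roughly 200 times slower. To see where the time goes, the plans
for the two queries can be compared. This is only a sketch: I have not
included plan output here, and the exact plan shapes will depend on the
version, statistics and settings. I would expect the first query to be
able to use a hashed anti join, and the second to fall back to a nested
loop anti join, since with the OR there is no longer a plain equality
join clause that can be hashed or merged.

```sql
-- Plain equality: the join clause can be hashed.
explain (costs off)
select count(*) from big b where not exists(select 1 from small s
where s.a = b.a);

-- With the OR, no hashable or mergeable join clause remains.
explain (costs off)
select count(*) from big b where not exists(select 1 from small s
where s.a = b.a or s.a is null);
```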


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
