On Sat, 2 Mar 2019 at 12:13, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> "Li, Zheng" <zhe...@amazon.com> writes:
> > Although adding "or var is NULL" to the anti join condition forces the
> > planner to choose nested loop anti join, it is always faster compared to
> > the original plan.
>
> TBH, I am *really* skeptical of sweeping claims like that.  The existing
> code will typically produce a hashed-subplan plan, which ought not be
> that awful as long as the subquery result doesn't blow out memory.
> It certainly is going to beat a naive nested loop.

It's pretty easy to show the claim is false using master and NOT EXISTS.

create table small(a int not null);
create table big (a int not null);
insert into small select generate_series(1,1000);
insert into big select x%1000+1 from generate_series(1,1000000) x;

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a);
Time: 178.575 ms

select count(*) from big b where not exists(select 1 from small s
where s.a = b.a or s.a is null);
Time: 38049.969 ms (00:38.050)

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
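
One way to see where the time goes in the test case above is to compare the two plans with EXPLAIN. This is only a sketch, assuming the small and big tables have already been created and populated as shown. No plan output is reproduced here, since it depends on version, settings and statistics. The expectation, consistent with the quoted claim, is that the first query can use a hash anti join while the second falls back to a nested loop anti join, because an OR'ed join condition cannot serve as a hash clause.

```sql
-- Refresh planner statistics so row estimates reflect the loaded data.
analyze small;
analyze big;

-- Plain equality in the NOT EXISTS: look for "Hash Anti Join" in the plan.
-- Note that the ANALYZE option really executes the query.
explain (analyze, costs off)
select count(*) from big b
where not exists (select 1 from small s where s.a = b.a);

-- Same query with "or s.a is null" added: look for "Nested Loop Anti Join".
-- This one takes roughly as long as the slow timing reported above.
explain (analyze, costs off)
select count(*) from big b
where not exists (select 1 from small s where s.a = b.a or s.a is null);
```

Dropping the analyze option from explain shows the chosen plan shape without paying for the slow execution.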