Greenplum Database does this optimization. The idea is to use a new join type, let's call it JOIN_LASJ_NOTIN, and its semantic regarding NULL is defined as below:
1. If there is a NULL in the outer side, and the inner side is empty, the NULL should be part of the outputs. 2. If there is a NULL in the outer side, and the inner side is not empty, the NULL should not be part of the outputs. 3. If there is a NULL in the inner side, no outputs should be produced. An example plan looks like: gpadmin=# explain (costs off) select * from t1 where a not in(select a from t2); QUERY PLAN ----------------------------------- Hash Left Anti Semi (Not-In) Join Hash Cond: (t1.a = t2.a) -> Seq Scan on t1 -> Hash -> Seq Scan on t2 (5 rows) Thanks Richard On Tue, Feb 26, 2019 at 7:20 AM David Rowley <david.row...@2ndquadrant.com> wrote: > On Tue, 26 Feb 2019 at 11:51, Li, Zheng <zhe...@amazon.com> wrote: > > Resend the patch with a whitespace removed so that "git apply patch" > works directly. > > I had a quick look at this and it seems to be broken for the empty > table case I mentioned up thread. > > Quick example: > > Setup: > > create table t1 (a int); > create table t2 (a int not null); > insert into t1 values(NULL),(1),(2); > > select * from t1 where a not in(select a from t2); > > Patched: > a > --- > 1 > 2 > (2 rows) > > Master: > a > --- > > 1 > 2 > (3 rows) > > This will be due to the fact you're adding an a IS NOT NULL qual to > the scan of a: > > postgres=# explain select * from t1 where a not in(select a from t2); > QUERY PLAN > ------------------------------------------------------------------ > Hash Anti Join (cost=67.38..152.18 rows=1268 width=4) > Hash Cond: (t1.a = t2.a) > -> Seq Scan on t1 (cost=0.00..35.50 rows=2537 width=4) > Filter: (a IS NOT NULL) > -> Hash (cost=35.50..35.50 rows=2550 width=4) > -> Seq Scan on t2 (cost=0.00..35.50 rows=2550 width=4) > (6 rows) > > but as I mentioned, you can't do that as t2 might be empty and there's > no way to know that during planning. > > -- > David Rowley > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.2ndQuadrant.com_&d=DwIBaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=5r3cnfZPUDOHrMiXq8Mq2g&m=dE1nglE17x3nD-oH_BrF0r4SLaFnQKzwwJBJGpDoaaA&s=dshupMomMvkDAd92918cU21AJ1E1s7QwbrxIGSRxZA8&e= > PostgreSQL Development, 24x7 Support, Training & Services > >