> > However the thing becomes complex with the below 2 cases. > > 1. SELECT * FROM t INNER JOIN j on t.nullable = q.b; > We know t.b will be not null **finally**. But the current plan may > something > like this: > > QUERY PLAN > ------------------------------------------ > Merge Join > Merge Cond: (t.nullable = j.something) > -> Sort > Sort Key: t.nullable > -> Seq Scan on t > -> Sort > Sort Key: j.something > -> Seq Scan on j > (8 rows) > > which means the Path "Seq Scan on t" still contains some null values. At > least, > we should not assume t.nullable is "not nullable" the base relation stage. > > 2. SELECT t.a FROM j LEFT JOIN t ON t.b = t.a; > Even the t.a is not null by definition, but it may have null **finally** > due to > the outer join. >
The above 2 cases have been addressed by defining the notnullattrs on every RelOptInfo, and maintaining them on every join. However, per offline discussion with David, IIUC, there is a more case to think about. CREATE TABLE t (a INT, b INT); SELECT * FROM t WHERE a = 1 and b = 2; We know b is not null after we evaluate the qual b = 2, but it may still nullable when we just evaluate a = 1; I prefer to not handle it by saying the semantics of notnullattrs is correct after we evaluate all the quals on its RelOptInfo. > It would be good to agree on the correct representation for Vars that >> cannot produce NULLs so that we don't shut the door on classes of >> optimisation that require something other than what you need for your >> case. >> >> > Looks we have to maintain not null on the general RelOptInfo level rather > than Base > RelOptInfo. But I don't want to teach Var about the notnull so far. The > reasons are: 1). > We need to maintain the Planner version and Parser version due to the VIEW > case. > 2). We have to ignore the extra part for equal(Var, Var) . 3). Var is > usually shared among > different RelOptInfo. which means we have to maintain different copies for > this purpose IIUC. > > I assume we want to know if a Var is nullable with a function like. > is_var_notnullable(Var *var, Relids relids). If so, we can define the > data as below: > > struct RelOptInfo { > > Bitmapset** notnullattrs; > .. > }; > > After this we can implement the function as: > /* * is_var_notnullable * Check if the var is nullable for a given RelOptIno after * all the quals on it have been evaluated. * * var is the var to check, relids is the ids of a RelOptInfo * we will check on. */ bool is_var_notnullable(Var* var, Relids relids) { RelOptInfo *rel = find_rel_by_relids(reldis); return bms_is_member(var->varattno, rel->notnullattrs[var->varno]); } Do you think this is a reasonable solution? > bool > is_var_notnullable(Var* var, Relids relids) > { > RelOptInfo *rel = find_rel_by_relids(reldis); > return bms_is_member(var->varattno, rel->notnullattrs[var->varno]); > } > > Probably we can make some hackers to reduce the notnullattrs's memory usage > overhead. > > -- Best Regards Andy Fan (https://www.aliyun.com/)