Hi all, consider the following SQL:
================================================================================================ gpadmin=# explain (verbose, costs off) select * from t, (select a from generate_series(1, 1)a)x, (select a from generate_series(1, 1)a)y where ((x.a+y.a)/4.0) > random(); QUERY PLAN ---------------------------------------------------------------------------------------- Nested Loop Output: t.a, t.b, a.a, a_1.a -> Nested Loop Output: a.a, a_1.a Join Filter: (((((a.a + a_1.a))::numeric / 4.0))::double precision > random()) -> Function Scan on pg_catalog.generate_series a Output: a.a Function Call: generate_series(1, 1) -> Function Scan on pg_catalog.generate_series a_1 Output: a_1.a Function Call: generate_series(1, 1) -> Seq Scan on public.t Output: t.a, t.b (13 rows) ================================================================================================ The where clause is "pushed down to the x,y" because it only references these two relations. The original query tree's join tree is like: FromExpr [] [fromlist] RangeTblRef [rtindex=1] RangeTblRef [rtindex=4] RangeTblRef [rtindex=5] [quals] OpExpr [opno=674 opfuncid=297 opresulttype=16 opretset=false] FuncExpr [funcid=1746 funcresulttype=701 funcretset=false funcvariadic=false funcformat=COERCE_IMPLICIT_CAST] OpExpr [opno=1761 opfuncid=1727 opresulttype=1700 opretset=false] FuncExpr [funcid=1740 funcresulttype=1700 funcretset=false funcvariadic=false funcformat=COERCE_IMPLICIT_CAST] OpExpr [opno=551 opfuncid=177 opresulttype=23 opretset=false] Var [varno=4 varattno=1 vartype=23 varnoold=4 varoattno=1] Var [varno=5 varattno=1 vartype=23 varnoold=5 varoattno=1] Const [consttype=1700 constlen=-1 constvalue=94908966309104 constisnull=false constbyval=false] FuncExpr [funcid=1598 funcresulttype=701 funcretset=false funcvariadic=false funcformat=COERCE_EXPLICIT_CALL] It seems the semantics it wants to express is: filter after join all the tables. Thus maybe a plan like Nested Loop Join Filter: (((((a.a + a_1.a))::numeric / 4.0))::double precision > random()) -> Nested Loop -> Function Scan on generate_series a -> Function Scan on generate_series a_1 -> Seq Scan on t (cost=0.00..32.60 rows=2260 width=8) May also be reasonable because it is just the direct translation from the original query tree. The above plans may have different property: * the first one, if we push down, can only produce 2 results: 0 rows, or 10 rows. No third possibility * the second one, will output 0 ~ 10 rows with equal probability. I am wondering if we should consider volatile functions in restrictinfo when try to distribute_restrictinfo_to_rels? Best, Zhenghua Lyu