Hi, I test some SQL in the latest Postgres master branch code (we find these issues when developing Greenplum database in the PR https://github.com/greenplum-db/gpdb/pull/10418, and my colleague come up with the following cases in Postgres):
create table t3 (c1 text, c2 text); CREATE TABLE insert into t3 select 'fhufehwiohewiuewhuhwiufhwifhweuhfwu', --random data 'fiowehufwhfyegygfewpfwwfeuhwhufwh' --random data from generate_series(1, 10000000) i; INSERT 0 10000000 analyze t3; ANALYZE create table t4 (like t3); CREATE TABLE insert into t4 select * from t4; INSERT 0 0 insert into t4 select * from t3; INSERT 0 10000000 analyze t4; ANALYZE set enable_hashjoin to off; SET explain (costs off) select count(*) from t3, t4 where t3.c1 like '%sss' and timeofday() = t4.c1 and t3.c1 = t4.c1; QUERY PLAN -------------------------------------------------------- Finalize Aggregate -> Gather Workers Planned: 2 -> Partial Aggregate -> Nested Loop Join Filter: (t3.c1 = t4.c1) -> Parallel Seq Scan on t3 Filter: (c1 ~~ '%sss'::text) -> Seq Scan on t4 Filter: (timeofday() = c1) (10 rows) explain (verbose, costs off) select count(*) from t3, (select *, timeofday() as x from t4 ) t4 where t3.c1 like '%sss' and timeofday() = t4.c1 and t3.c1 = t4.c1; QUERY PLAN ------------------------------------------------------------------ Finalize Aggregate Output: count(*) -> Gather Output: (PARTIAL count(*)) Workers Planned: 2 -> Partial Aggregate Output: PARTIAL count(*) -> Nested Loop Join Filter: (t3.c1 = t4.c1) -> Parallel Seq Scan on public.t3 Output: t3.c1, t3.c2 Filter: (t3.c1 ~~ '%sss'::text) -> Seq Scan on public.t4 Output: t4.c1, NULL::text, timeofday() Filter: (timeofday() = t4.c1) (15 rows) Focus on the last two plans, the function timeofday is volatile but paralle-safe. And Postgres outputs two parallel plan. The first plan: Finalize Aggregate -> Gather Workers Planned: 2 -> Partial Aggregate -> Nested Loop Join Filter: (t3.c1 = t4.c1) -> Parallel Seq Scan on t3 Filter: (c1 ~~ '%sss'::text) -> Seq Scan on t4 Filter: (timeofday() = c1) The join's left tree is parallel scan and the right tree is seq scan. This algorithm is correct using the distribute distributive law of distributed join: A = [A1 A2 A3...An], B then A join B = gather( (A1 join B) (A2 join B) ... (An join B) ) The correctness of the above law should have a pre-assumption: The data set of B is the same in each join: (A1 join B) (A2 join B) ... (An join B) But things get complicated when volatile functions come in. Timeofday is just an example to show the idea. The core is volatile functions can return different results on successive calls with the same arguments. Thus the following piece, the right tree of the join -> Seq Scan on t4 Filter: (timeofday() = c1) can not be considered consistent everywhere in the scan workers. The second plan Finalize Aggregate Output: count(*) -> Gather Output: (PARTIAL count(*)) Workers Planned: 2 -> Partial Aggregate Output: PARTIAL count(*) -> Nested Loop Join Filter: (t3.c1 = t4.c1) -> Parallel Seq Scan on public.t3 Output: t3.c1, t3.c2 Filter: (t3.c1 ~~ '%sss'::text) -> Seq Scan on public.t4 Output: t4.c1, NULL::text, timeofday() Filter: (timeofday() = t4.c1) have voltile projections in the right tree of the nestloop: -> Seq Scan on public.t4 Output: t4.c1, NULL::text, timeofday() Filter: (timeofday() = t4.c1) It should not be taken as consistent in different workers. ------------------------------------------------------------------------------------------ The above are just two cases we find today. And it should be enough to show the core issue to have a discussion here. The question is, should we consider volatile functions when generating parallel plans? ------------------------------------------------------------------------------------------ FYI, some plan diffs of Greenplum can be found here: https://www.diffnow.com/report/etulf