Re: Volatile Functions in Parallel Plans

Amit Kapila Wed, 15 Jul 2020 21:09:01 -0700

On Wed, Jul 15, 2020 at 6:14 PM Zhenghua Lyu <[email protected]> wrote:
>
>
> The first plan:
>
>  Finalize Aggregate
>    ->  Gather
>          Workers Planned: 2
>          ->  Partial Aggregate
>                ->  Nested Loop
>                      Join Filter: (t3.c1 = t4.c1)
>                      ->  Parallel Seq Scan on t3
>                            Filter: (c1 ~~ '%sss'::text)
>                      ->  Seq Scan on t4
>                            Filter: (timeofday() = c1)
>
> The join's left tree is a parallel scan and the right tree is a seq scan.
> This algorithm is correct by the distributive law of distributed joins:
>        A = [A1 A2 A3...An], B then
>        A join B = gather( (A1 join B) (A2 join B) ... (An join B) )
>
> The correctness of the above law relies on an assumption:
>       The data set of B is the same in each join:
>       (A1 join B) (A2 join B) ... (An join B)
>
> But things get complicated when volatile functions come in. timeofday() is
> just an example to show the idea. The core issue is that volatile functions
> can return different results on successive calls with the same arguments.
> Thus the following piece, the right tree of the join
>                      ->  Seq Scan on t4
>                            Filter: (timeofday() = c1)
> cannot be considered consistent across the scan workers.
>


But this won't be consistent even for non-parallel plans.  I mean that
on each loop of the join, the "Seq Scan on t4" is rescanned and its
filter is re-evaluated, so it can give different results.  Currently,
we don't consider volatile functions to be parallel-safe by default.
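
To illustrate the serial case, here is a minimal Python sketch (a toy
model, not PostgreSQL code).  The names make_volatile, scan_t4 and
inner_sets_per_outer_row are hypothetical, and a simple counter stands in
for timeofday().  It only shows that when the inner side of a nested loop
is rescanned for each outer row, a volatile filter can select a different
set of inner rows on each loop, with no parallelism involved.

```python
import itertools


def make_volatile():
    # Hypothetical stand-in for a volatile function such as timeofday():
    # every call returns a different value, even with the same arguments.
    counter = itertools.count()
    return lambda: next(counter)


def scan_t4(t4_rows, volatile_fn):
    # Models "Seq Scan on t4 / Filter: (timeofday() = c1)".
    # The filter is re-evaluated for every row on every (re)scan.
    return [c1 for c1 in t4_rows if volatile_fn() == c1]


def inner_sets_per_outer_row(t3_rows, t4_rows):
    # Serial nested loop: the inner side is rescanned once per outer row.
    # Returns the inner rows that passed the filter on each loop.
    volatile_fn = make_volatile()
    return [scan_t4(t4_rows, volatile_fn) for _ in t3_rows]


if __name__ == "__main__":
    loops = inner_sets_per_outer_row(["a", "b"], [0, 1, 2])
    for i, inner in enumerate(loops):
        print(f"outer row {i}: inner rows = {inner}")
```

In this toy run the first loop sees all three inner rows and the second
loop sees none, so "the data set of B is the same in each join" already
fails in a single process.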

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
