Re: fix cost subqueryscan wrong parallel cost

Richard Guo Fri, 15 Apr 2022 02:17:13 -0700

On Fri, Apr 15, 2022 at 12:50 AM Robert Haas <robertmh...@gmail.com> wrote:


> On Tue, Apr 12, 2022 at 2:57 AM bu...@sohu.com <bu...@sohu.com> wrote:
> > The cost_subqueryscan function does not judge whether it is parallel.
>
> I don't see any reason why it would need to do that. A subquery scan
> isn't parallel aware.
>
> > regress
> > -- Incremental sort vs. set operations with varno 0
> > set enable_hashagg to off;
> > explain (costs off) select * from t union select * from t order by 1,3;
> >                         QUERY PLAN
> > ----------------------------------------------------------
> >  Incremental Sort
> >    Sort Key: t.a, t.c
> >    Presorted Key: t.a
> >    ->  Unique
> >          ->  Sort
> >                Sort Key: t.a, t.b, t.c
> >                ->  Append
> >                      ->  Gather
> >                            Workers Planned: 2
> >                            ->  Parallel Seq Scan on t
> >                      ->  Gather
> >                            Workers Planned: 2
> >                            ->  Parallel Seq Scan on t t_1
> > to
> >  Incremental Sort
> >    Sort Key: t.a, t.c
> >    Presorted Key: t.a
> >    ->  Unique
> >          ->  Sort
> >                Sort Key: t.a, t.b, t.c
> >                ->  Gather
> >                      Workers Planned: 2
> >                      ->  Parallel Append
> >                            ->  Parallel Seq Scan on t
> >                            ->  Parallel Seq Scan on t t_1
> > Obviously the latter is less expensive
>
> Generally it should be. But there's no subquery scan visible here.
>

In set operations, the paths for the child subtrees are SubqueryScan
paths. In this case the SubqueryScan nodes are removed later, in
set_plan_references(), because they are considered trivial, which is
why none shows up in the plan.
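
To illustrate why EXPLAIN shows no SubqueryScan, here is a toy model of
that removal. It is only a sketch: the Plan class, the passthrough flag
and strip_trivial_subqueryscans() are made-up names, and "trivial" is
simplified to "no quals and the target list just forwards the child's
output". It is not the actual setrefs.c code.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    node: str
    children: list = field(default_factory=list)
    quals: list = field(default_factory=list)
    # Hypothetical flag: the target list only forwards the child's columns.
    passthrough: bool = False


def strip_trivial_subqueryscans(plan):
    """Return the plan tree with trivial SubqueryScan nodes replaced by their child."""
    plan.children = [strip_trivial_subqueryscans(c) for c in plan.children]
    is_trivial = (
        plan.node == "SubqueryScan"
        and not plan.quals
        and plan.passthrough
        and len(plan.children) == 1
    )
    return plan.children[0] if is_trivial else plan


def node_names(plan):
    """Flatten the tree into a pre-order list of node names."""
    names = [plan.node]
    for child in plan.children:
        names.extend(node_names(child))
    return names


def union_arm():
    # One arm of the UNION: SubqueryScan -> Gather -> Parallel Seq Scan.
    scan = Plan("Parallel Seq Scan")
    gather = Plan("Gather", children=[scan])
    return Plan("SubqueryScan", children=[gather], passthrough=True)


before = Plan("Append", children=[union_arm(), union_arm()])
after = strip_trivial_subqueryscans(before)
print(node_names(after))
```

The costing still happens while the SubqueryScan paths exist, so
cost_subqueryscan() affects which plan wins even though the final plan
does not show the node.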


>
> There may well be something wrong here, but I don't think that you've
> diagnosed the problem correctly, or explained it clearly.
>

Some debugging shows that the second path is generated but then loses
when competing with the first path. So if something is wrong here, I
suspect the cost calculation.

Unrelated to this topic, I noticed another problem in the plan. Note
the first Sort node, which is there to unique-ify the result of the
UNION. Why can't we rearrange its sort keys from (a, b, c) to (a, c, b)
so that we can avoid the second Sort node?
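
A small sketch of why the reordering would be safe, using made-up
(a, b, c) rows and a hypothetical sort_unique() helper that mimics Sort
followed by Unique. Removing duplicates works with any ordering of the
full column list, but only the (a, c, b) ordering leaves the output
already sorted on (a, c), which is what "order by 1,3" asks for.

```python
from itertools import groupby


def sort_unique(rows, key_order):
    """Sort rows on the given column order, then drop adjacent duplicates."""
    ordered = sorted(rows, key=lambda r: tuple(r[i] for i in key_order))
    return [row for row, _ in groupby(ordered)]


def is_sorted_on_a_c(result):
    """Check whether the rows are ordered by columns a and c."""
    keys = [(r[0], r[2]) for r in result]
    return keys == sorted(keys)


# Made-up rows standing in for "select * from t union select * from t".
rows = [(1, 2, 3), (1, 1, 5), (2, 0, 1), (1, 2, 3), (1, 9, 3), (2, 0, 1)]

by_abc = sort_unique(rows, (0, 1, 2))  # sort keys (a, b, c)
by_acb = sort_unique(rows, (0, 2, 1))  # sort keys (a, c, b)

# Both orderings remove the same duplicates.
assert set(by_abc) == set(by_acb) and len(by_abc) == len(by_acb)

print(is_sorted_on_a_c(by_abc))
print(is_sorted_on_a_c(by_acb))
```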

Thanks
Richard
