Re: enable_incremental_sort changes query behavior

Tomas Vondra Wed, 07 Oct 2020 15:23:25 -0700

On Wed, Oct 07, 2020 at 04:01:27PM -0400, James Coleman wrote:

On Wed, Oct 7, 2020 at 3:52 PM Robert Haas <[email protected]> wrote:

On Thu, Oct 1, 2020 at 9:03 AM James Coleman <[email protected]> wrote:
> The plan (obtained by replacing the volatile function with a stable one):
>
>  Unique
>    ->  Nested Loop
>          ->  Gather Merge
>                Workers Planned: 2
>                ->  Sort
>                      Sort Key: ref_0.i, (md5(ref_0.t))
>                      ->  Parallel Index Scan using ref_0_i_idx on ref_0
>          ->  Function Scan on generate_series ref_1
>
> Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error.

As far as I remember this stuff, target lists for any nodes below the
top of the join tree were previously always just Var nodes. Any
expressions that needed to be computed were computed at the level of
the topmost join, before adding upper planner steps like Agg,
WindowAgg, Limit, etc. But, here you've got a complex expression --
md5(ref_0.t) -- begin computed somewhere in the middle of the plan
tree so that the sort can be performed at that point. However, that
seems likely to break a bunch of assumptions all over the planner,
because, unless a lot of work and surgery has been done since I looked
at this last, that md5(ref_0.t) value is not going to "bubble up" to
higher levels of the plan through the tlists of the individual nodes.
That poses a risk that it will be recomputed multiple times, which is
particularly bad for volatile functions because you might not always
get the same answer, but it seems like it might be bad even for
non-volatile functions, because it could be slow if the value is used
at multiple places in the query. It feels like the sort of thing that
might have broken some other assumptions, too, although I can't say
exactly what.


The distinction between volatile and stable expressions here has been
discussed at length in this thread; I think it'd be worth reading
through the rest of the thread before offering sledgehammer ideas
below like reverting the entire piece.


Um, that's a tad too harsh, I guess ...

We've discussed the volatile/stable expressions before, but AFAIK we
failed to notice the planner assumes such expressions are expected to be
computed only in the upper parts of the plan. If this assumption really
exists, it probably relies on simply calling create_sort_path just about
right, and the new code looking at useful pathkeys below gather merge
might be breaking that simply by doing it too early.

In that case we can either rip this code out, effectively reverting most
of ba3e76cc571, or consider only pathkeys that we can find in the tlist
of the child path.

I'm still not entirely sure I understand what's happening, or what the
exact rule is. Consider this query:

  explain (verbose) select distinct i, t, md5(t) from ref_0;

which on PG12 (i.e. before incremental sort) is planned like this:

                                 QUERY PLAN
----------------------------------------------------------------------------------
 Unique  (cost=78120.92..83120.92 rows=500000 width=65)
   Output: i, t, (md5(t))
   ->  Sort  (cost=78120.92..79370.92 rows=500000 width=65)
         Output: i, t, (md5(t))
         Sort Key: ref_0.i, ref_0.t, (md5(ref_0.t))
         ->  Seq Scan on public.ref_0  (cost=0.00..10282.00 rows=500000 
width=65)
               Output: i, t, md5(t)
(7 rows)

i.e. the (stable) function is pushed all the way to the scan node. And
even if we replace it with a volatile expression it gets pushed down:

explain (verbose) select distinct i, t, md5(random()::text || t) from ref_0;

                                 QUERY PLAN
----------------------------------------------------------------------------------
 Unique  (cost=83120.92..88120.92 rows=500000 width=65)
   Output: i, t, (md5(((random())::text || t)))
   ->  Sort  (cost=83120.92..84370.92 rows=500000 width=65)
         Output: i, t, (md5(((random())::text || t)))
         Sort Key: ref_0.i, ref_0.t, (md5(((random())::text || ref_0.t)))
         ->  Seq Scan on public.ref_0  (cost=0.00..15282.00 rows=500000 
width=65)
               Output: i, t, md5(((random())::text || t))
(7 rows)


But perhaps I just don't understand the assumption correctly?

In particular, most of the discussion in the thread has been about
what is actually safe to reference at lower levels in the plan. It
would be extremely easy tomake this only allow sorting by Vars at
lower levels in the plan, though the current revision of the patch
follows what prepare_sort_from_pathkeys allows and disallows.

Obviously it'd be nice to be able to sort by non-Var expressions at
lower levels too. But again, it'd be easy to change that if it's not
safe, so it'd be helpful if you could point to areas that don't bubble
non-Vars up so that we can actually comment about that in the source
and discuss specifics.


I think the proble here might be "what we think is theoretically safe"
vs. "what the planner expects". I wouldn't be surprised if this was safe
in theoretical sense, but planner not expecting that.

I wonder if whatever part of the incremental sort patch allowed
computation of tlist expressions below the top level of the join tree
ought to just be reverted outright, but that might be overkill. I
can't say that I really understand exactly what's going on here, but
it seems like a pretty significant behavior change to make as part of
another project, and I wonder if we're going to keep finding other
problems with it.


It certainly was not my intention to make such change in this patch. I
don't think the earlier parts of the incremental sort have this issue
because all those places were already constructing regular sort before.

But this code is different because it considers adding sort node (both
regular and incremental) below a gather merge, which was not happening
before. I suspect this is the problem, somehow.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: enable_incremental_sort changes query behavior

Reply via email to