On Wed, Oct 07, 2020 at 03:52:04PM -0400, Robert Haas wrote:
On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc...@gmail.com> wrote:
The plan (obtained by replacing the volatile function with a stable one):
Unique
-> Nested Loop
-> Gather Merge
Workers Planned: 2
-> Sort
Sort Key: ref_0.i, (md5(ref_0.t))
-> Parallel Index Scan using ref_0_i_idx on ref_0
-> Function Scan on generate_series ref_1
Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error.
As far as I remember this stuff, target lists for any nodes below the
top of the join tree were previously always just Var nodes. Any
expressions that needed to be computed were computed at the level of
the topmost join, before adding upper planner steps like Agg,
WindowAgg, Limit, etc. But, here you've got a complex expression --
md5(ref_0.t) -- begin computed somewhere in the middle of the plan
tree so that the sort can be performed at that point. However, that
seems likely to break a bunch of assumptions all over the planner,
because, unless a lot of work and surgery has been done since I looked
at this last, that md5(ref_0.t) value is not going to "bubble up" to
higher levels of the plan through the tlists of the individual nodes.
That poses a risk that it will be recomputed multiple times, which is
particularly bad for volatile functions because you might not always
get the same answer, but it seems like it might be bad even for
non-volatile functions, because it could be slow if the value is used
at multiple places in the query. It feels like the sort of thing that
might have broken some other assumptions, too, although I can't say
exactly what.
I agree this might be a clear correctness issue for volatile functions.
Not sure about stable functions, though - it's easy to construct cases
where pushing the function down results in much faster execution.
Consider this:
create table t (id int, val text);
insert into t select 1, repeat('x', 1000) from generate_series(1,100) s(i);
-- no pushdown
select id, md5(t1.val) from t t1 join t t2 using (id) join t t3 using (id);
Time: 5222.895 ms (00:05.223)
-- manual pushdown
with x as (select id, md5(t.val) from t)
select id, t1.md5 from x t1 join x t2 using (id) join x t3 using (id);
Time: 486.455 ms
Of course, this is independent of what the planner assumes correctly,
which is what's being discussed in this thread. And I'm sure there are
plenty counter-examples too, so we'd need to cost this properly.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services