(2019/03/09 5:36), Tom Lane wrote:
Etsuro Fujita<fujita.ets...@lab.ntt.co.jp> writes:
(2019/02/28 0:52), Robert Haas wrote:
On Tue, Feb 26, 2019 at 7:26 AM Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp> wrote:
The parallel safety of the final scan/join target is determined from the
grouping target, not that target, which [ is wrong ]
Your patch looks right to me.
I think it would be better for you to take this one. Could you?
I concur with Robert that this is a brown-paper-bag-grade bug in
960df2a97. Please feel free to push (and don't forget to back-patch).
OK, will do.
The only really interesting question is whether it's worth adding
a regression test for. I doubt it; the issue seems much too narrow.
Usually the point of a regression test is to prevent re-introduction
of the same/similar bug, but what class of bugs would you argue
we'd be protecting against?
Plan degradation; without the fix, we would have this on data created by
"pgbench -i -s 10 postgres", as shown in [1]:
postgres=# set parallel_setup_cost = 10;
postgres=# set parallel_tuple_cost = 0.001;
postgres=# explain verbose select aid+bid, sum(abalance), random() from
pgbench_accounts group by aid+bid;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=137403.01..159903.01 rows=1000000 width=20)
Output: ((aid + bid)), sum(abalance), random()
Group Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
-> Sort (cost=137403.01..139903.01 rows=1000000 width=8)
Output: ((aid + bid)), abalance
Sort Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
-> Gather (cost=10.00..24070.67 rows=1000000 width=8)
--> Output: (aid + bid), abalance
Workers Planned: 2
-> Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..20560.67 rows=416667 width=12)
Output: aid, bid, abalance
(11 rows)
The final scan/join target {(aid + bid), abalance} would be parallel
safe, but in the plan the target is not parallelized across workers.
The reason for that is because the parallel-safety of the target is
assessed incorrectly using the grouping target {((aid + bid)),
sum(abalance), random()}, which would not be parallel safe. By the fix
we would have this:
postgres=# explain verbose select aid+bid, sum(abalance), random() from
pgbench_accounts group by aid+bid;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
GroupAggregate (cost=135944.68..158444.68 rows=1000000 width=20)
Output: ((aid + bid)), sum(abalance), random()
Group Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
-> Sort (cost=135944.68..138444.68 rows=1000000 width=8)
Output: ((aid + bid)), abalance
Sort Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
-> Gather (cost=10.00..22612.33 rows=1000000 width=8)
Output: ((aid + bid)), abalance
Workers Planned: 2
-> Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..21602.33 rows=416667 width=8)
--> Output: (aid + bid), abalance
(11 rows)
Note that the final scan/join target is parallelized in the plan.
This would only affect plan quality a little bit, so I don't think we
need a regression test for this, either, but the fix might destabilize
existing plan choices, so we should apply it to HEAD only?
Best regards,
Etsuro Fujita