(2019/03/09 5:36), Tom Lane wrote:
Etsuro Fujita<fujita.ets...@lab.ntt.co.jp>  writes:
(2019/02/28 0:52), Robert Haas wrote:
On Tue, Feb 26, 2019 at 7:26 AM Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp>   wrote:
The parallel safety of the final scan/join target is determined from the
grouping target, not that target, which [ is wrong ]

Your patch looks right to me.

I think it would be better for you to take this one.  Could you?

I concur with Robert that this is a brown-paper-bag-grade bug in
960df2a97.  Please feel free to push (and don't forget to back-patch).

OK, will do.

The only really interesting question is whether it's worth adding
a regression test for.  I doubt it; the issue seems much too narrow.
Usually the point of a regression test is to prevent re-introduction
of the same/similar bug, but what class of bugs would you argue
we'd be protecting against?

Plan degradation; without the fix, we would have this on data created by "pgbench -i -s 10 postgres", as shown in [1]:

postgres=# set parallel_setup_cost = 10;
postgres=# set parallel_tuple_cost = 0.001;

postgres=# explain verbose select aid+bid, sum(abalance), random() from pgbench_accounts group by aid+bid;
                                                 QUERY PLAN

------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=137403.01..159903.01 rows=1000000 width=20)
   Output: ((aid + bid)), sum(abalance), random()
   Group Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
   ->  Sort  (cost=137403.01..139903.01 rows=1000000 width=8)
         Output: ((aid + bid)), abalance
         Sort Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
         ->  Gather  (cost=10.00..24070.67 rows=1000000 width=8)
-->            Output: (aid + bid), abalance
               Workers Planned: 2
-> Parallel Seq Scan on public.pgbench_accounts (cost=0.00..20560.67 rows=416667 width=12)
                     Output: aid, bid, abalance
(11 rows)

The final scan/join target {(aid + bid), abalance} would be parallel safe, but in the plan the target is not parallelized across workers. The reason for that is because the parallel-safety of the target is assessed incorrectly using the grouping target {((aid + bid)), sum(abalance), random()}, which would not be parallel safe. By the fix we would have this:

postgres=# explain verbose select aid+bid, sum(abalance), random() from pgbench_accounts group by aid+bid;
                                                QUERY PLAN

-----------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=135944.68..158444.68 rows=1000000 width=20)
   Output: ((aid + bid)), sum(abalance), random()
   Group Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
   ->  Sort  (cost=135944.68..138444.68 rows=1000000 width=8)
         Output: ((aid + bid)), abalance
         Sort Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
         ->  Gather  (cost=10.00..22612.33 rows=1000000 width=8)
               Output: ((aid + bid)), abalance
               Workers Planned: 2
-> Parallel Seq Scan on public.pgbench_accounts (cost=0.00..21602.33 rows=416667 width=8)
-->                  Output: (aid + bid), abalance
(11 rows)

Note that the final scan/join target is parallelized in the plan.

This would only affect plan quality a little bit, so I don't think we need a regression test for this, either, but the fix might destabilize existing plan choices, so we should apply it to HEAD only?

Best regards,
Etsuro Fujita


Reply via email to