On Thu, Nov 16, 2017 at 12:24 AM, Andres Freund <and...@anarazel.de> wrote:
> Hi,
>
> On 2017-11-15 13:48:18 -0500, Robert Haas wrote:
>> I think that we need a little bit deeper analysis here to draw any
>> firm conclusions.
>
> Indeed.
>
>
>> I suspect that one factor is that many of the queries actually send
>> very few rows through the Gather.
>
> Yep. I kinda wonder if the same result would present if the benchmarks
> were run with parallel_leader_participation. The theory being what we're
> seeing is just that the leader doesn't accept any tuples, and the large
> queue size just helps because workers can run for longer.
>
I ran Q12 with parallel_leader_participation = off and could not get any
performance improvement with the patches given by Robert. The result was
the same for head as well. The query plan also remains unaffected by the
value of this parameter.

Here are the details of the experiment:
TPC-H scale factor = 20
work_mem = 1GB
random_page_cost = seq_page_cost = 0.1
max_parallel_workers_per_gather = 4
PG commit: 745948422c799c1b9f976ee30f21a7aac050e0f3

Please find attached the EXPLAIN ANALYZE output for both values of
parallel_leader_participation with the patches.

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

with parallel_leader_participation = 1;

                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1001.19..504469.79 rows=1 width=27) (actual time=21833.206..21833.207 rows=1 loops=1)
   ->  GroupAggregate  (cost=1001.19..3525281.42 rows=7 width=27) (actual time=21833.203..21833.203 rows=1 loops=1)
         Group Key: lineitem.l_shipmode
         ->  Gather Merge  (cost=1001.19..3515185.81 rows=576888 width=27) (actual time=4.388..21590.757 rows=311095 loops=1)
               Workers Planned: 4
               Workers Launched: 4
               ->  Nested Loop  (cost=1.13..3445472.82 rows=144222 width=27) (actual time=0.150..4399.384 rows=62247 loops=5)
                     ->  Parallel Index Scan using l_shipmode on lineitem  (cost=0.57..3337659.69 rows=144222 width=19) (actual time=0.111..3772.865 rows=62247 loops=5)
                           Index Cond: (l_shipmode = ANY ('{"REG AIR",RAIL}'::bpchar[]))
                           Filter: ((l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1995-01-01'::date) AND (l_receiptdate < '1996-01-01 00:00:00'::timestamp without time zone))
                           Rows Removed by Filter: 3367603
                     ->  Index Scan using orders_pkey on orders  (cost=0.56..0.75 rows=1 width=20) (actual time=0.009..0.009 rows=1 loops=311236)
                           Index Cond: (o_orderkey = lineitem.l_orderkey)
 Planning time: 0.526 ms
 Execution time: 21835.922 ms
(15 rows)

postgres=# set parallel_leader_participation =0;
SET
postgres=# \i /data/rafia.sabih/dss/queries/12.sql
                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1001.19..504469.79 rows=1 width=27) (actual time=21179.065..21179.066 rows=1 loops=1)
   ->  GroupAggregate  (cost=1001.19..3525281.42 rows=7 width=27) (actual time=21179.064..21179.064 rows=1 loops=1)
         Group Key: lineitem.l_shipmode
         ->  Gather Merge  (cost=1001.19..3515185.81 rows=576888 width=27) (actual time=4.201..20941.385 rows=311095 loops=1)
               Workers Planned: 4
               Workers Launched: 4
               ->  Nested Loop  (cost=1.13..3445472.82 rows=144222 width=27) (actual time=0.187..5105.780 rows=77797 loops=4)
                     ->  Parallel Index Scan using l_shipmode on lineitem  (cost=0.57..3337659.69 rows=144222 width=19) (actual time=0.149..4362.235 rows=77797 loops=4)
                           Index Cond: (l_shipmode = ANY ('{"REG AIR",RAIL}'::bpchar[]))
                           Filter: ((l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND (l_receiptdate >= '1995-01-01'::date) AND (l_receiptdate < '1996-01-01 00:00:00'::timestamp without time zone))
                           Rows Removed by Filter: 4208802
                     ->  Index Scan using orders_pkey on orders  (cost=0.56..0.75 rows=1 width=20) (actual time=0.008..0.008 rows=1 loops=311187)
                           Index Cond: (o_orderkey = lineitem.l_orderkey)
 Planning time: 0.443 ms
 Execution time: 21183.148 ms
(15 rows)
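
For anyone who wants to sanity-check the two plans above, here is a minimal Python sketch of the arithmetic. It only reuses figures copied from the EXPLAIN ANALYZE output; the dictionary keys and helper names (relative_change, approx_total_rows) are made up for illustration. Note that EXPLAIN reports rows as a per-loop average, so the totals it derives are approximate.

```python
# Figures copied from the two EXPLAIN ANALYZE outputs in this mail.
# Key names are illustrative only; they are not PostgreSQL identifiers.
runs = {
    "leader_on": {"exec_ms": 21835.922, "avg_rows": 62247, "loops": 5},
    "leader_off": {"exec_ms": 21183.148, "avg_rows": 77797, "loops": 4},
}


def relative_change(before, after):
    """Fractional change going from `before` to `after`."""
    return (after - before) / before


def approx_total_rows(avg_rows, loops):
    """EXPLAIN shows rows averaged per loop; multiply back for a rough total."""
    return avg_rows * loops


change = relative_change(runs["leader_on"]["exec_ms"],
                         runs["leader_off"]["exec_ms"])
print(f"execution time change with leader participation off: {change:+.1%}")

for name, run in runs.items():
    total = approx_total_rows(run["avg_rows"], run["loops"])
    print(f"{name}: ~{total} rows across {run['loops']} processes")
```

By this rough measure the execution time differs by about 3% between the two settings, and the scan returns roughly the same total number of rows either way, just divided among 5 processes in one case and 4 in the other.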