I have done some more testing using the TPC-H benchmark (for some of the queries, especially ones exercising parallel hash join), and the results summary is as below.

Planning Time (ms)

Query       Base     Patch
TPC-H Q2    2.2      2.4
TPC-H Q3    0.67     0.71
TPC-H Q5    3.17     2.3
TPC-H Q7    2.43     2.4

Execution Time (ms)

Query       Base     Patch
TPC-H Q2    2826     766
TPC-H Q3    23473    24271
TPC-H Q5    21357    1432
TPC-H Q7    6779     1138

All test files and detailed plan output are attached to this mail: q2.sql, q3.sql, q5.sql and q7.sql are the TPC-H benchmark's 2nd, 3rd, 5th and 7th queries, and the results with base and parallel join are in q*_base.out and q*_parallel.out respectively.

Summary: With the TPC-H queries, wherever a Hash Join is pushed under a Gather node, a significant improvement is visible. With Q2, using 3 workers, the time consumed is almost 1/3 of the base.

I observed one problem: with Q5 and Q7, some relation and snapshot references are leaked and I get the warnings below. I haven't yet looked into the issue.

WARNING: relcache reference leak: relation "customer" not closed
WARNING: relcache reference leak: relation "customer" not closed
WARNING: relcache reference leak: relation "customer" not closed
WARNING: Snapshot reference leak: Snapshot 0x2d1fee8 still referenced
WARNING: Snapshot reference leak: Snapshot 0x2d1fee8 still referenced
WARNING: Snapshot reference leak: Snapshot 0x2d1fee8 still referenced
WARNING: relcache reference leak: relation "customer" not closed
CONTEXT: parallel worker, PID 123413
WARNING: Snapshot reference leak: Snapshot 0x2d1fee8 still referenced
CONTEXT: parallel worker, PID 123413
WARNING: relcache reference leak: relation "customer" not closed
CONTEXT: parallel worker, PID 123412
WARNING: Snapshot reference leak: Snapshot 0x2d1fee8 still referenced
CONTEXT: parallel worker, PID 123412
WARNING: relcache reference leak: relation "customer" not closed
CONTEXT: parallel worker, PID 123411
WARNING: Snapshot reference leak: Snapshot 0x2d1fee8 still referenced
CONTEXT: parallel worker, PID 123411
psql:q7.sql:40: WARNING: relcache reference leak: relation "customer" not closed
psql:q7.sql:40: WARNING: Snapshot reference leak: Snapshot 0x2d1fee8 still referenced
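(A minimal sketch of how these runs can be repeated, assuming a build with the parallel-join patch applied; the exact invocation is not shown in this mail, and max_parallel_degree = 3 is inferred from the 3 workers noted for Q2 above:

    set max_parallel_degree = 3;  -- inferred setting; not stated explicitly in this mail
    \i q2.sql                     -- each attached q*.sql wraps one TPC-H query in EXPLAIN ANALYZE

The q*_base.out numbers would come from running the same files on an unpatched build.)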
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

On Wed, Dec 16, 2015 at 6:19 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Tue, Dec 15, 2015 at 7:31 PM, Robert Haas <robertmh...@gmail.com> wrote:
> >
> > On Mon, Dec 14, 2015 at 8:38 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > set enable_hashjoin=off;
> > > set enable_mergejoin=off;
> >
> > [ ... ]
> > >
> > > Now here the point to observe is that the non-parallel case uses both less
> > > Execution time and Planning time to complete the statement. There
> > > is a considerable increase in planning time without any benefit in
> > > execution.
> >
> > So, you forced the query planner to give you a bad plan, and then
> > you're complaining that the plan is bad?
> >
>
> Oh no, I wanted to check the behaviour of parallel vs. non-parallel
> execution of joins. I think even if hash and merge join are set to
> off, it should have picked up a non-parallel NestLoop plan.
> In any case, I have done some more investigation of the patch and found
> that even without changing query-planner-related parameters, it seems to
> give bad plans (as in the example below [1]). I think the costing of the
> rework each worker has to do seems to be missing here, which is causing
> bad plans to be selected over good plans. Another point is that with the
> patch, the number of paths that we explore to get the cheapest path has
> increased; do you think we should try to evaluate it? One way is to run
> some queries with more joins and see the impact on planning time; the
> other is to calculate the increase in the number of paths the planner
> explores.
>
> [1] -
> CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM
> generate_series(1, 10000000) g;
>
> CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM
> generate_series(1, 3000000) g;
>
> Analyze t1;
> Analyze t2;
>
> Non-parallel case:
>
> postgres=# Explain Analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1 BETWEEN 100 AND 200;
>                                                   QUERY PLAN
> -------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=261519.93..261519.94 rows=1 width=0) (actual time=2779.965..2779.965 rows=1 loops=1)
>    ->  Hash Join  (cost=204052.91..261519.92 rows=1 width=0) (actual time=2017.241..2779.947 rows=101 loops=1)
>          Hash Cond: (t2.c1 = t1.c1)
>          ->  Seq Scan on t2  (cost=0.00..46217.00 rows=3000000 width=4) (actual time=0.073..393.479 rows=3000000 loops=1)
>          ->  Hash  (cost=204052.90..204052.90 rows=1 width=4) (actual time=2017.130..2017.130 rows=101 loops=1)
>                Buckets: 1024  Batches: 1  Memory Usage: 12kB
>                ->  Seq Scan on t1  (cost=0.00..204052.90 rows=1 width=4) (actual time=0.038..2017.105 rows=101 loops=1)
>                      Filter: ((c1 >= 100) AND (c1 <= 200))
>                      Rows Removed by Filter: 9999899
>  Planning time: 0.113 ms
>  Execution time: 2780.000 ms
> (11 rows)
>
> Parallel case:
>
> set max_parallel_degree=4;
>
> postgres=# Explain Analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1 BETWEEN 100 AND 200;
>                                                   QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=100895.52..100895.53 rows=1 width=0) (actual time=67871.443..67871.443 rows=1 loops=1)
>    ->  Gather  (cost=1000.00..100895.52 rows=1 width=0) (actual time=0.653..67871.287 rows=101 loops=1)
>          Number of Workers: 4
>          ->  Nested Loop  (cost=0.00..99895.42 rows=1 width=0) (actual time=591.408..16455.731 rows=20 loops=5)
>                Join Filter: (t1.c1 = t2.c1)
>                Rows Removed by Join Filter: 60599980
>                ->  Parallel Seq Scan on t1  (cost=0.00..45345.09 rows=0 width=4) (actual time=433.350..433.386 rows=20 loops=5)
>                      Filter: ((c1 >= 100) AND (c1 <= 200))
>                      Rows Removed by Filter: 1999980
>                ->  Seq Scan on t2  (cost=0.00..46217.00 rows=3000000 width=4) (actual time=0.005..395.480 rows=3000000 loops=101)
>  Planning time: 0.114 ms
>  Execution time: 67871.584 ms
> (12 rows)
>
> Without patch, parallel case:
>
> set max_parallel_degree=4;
>
> Explain Analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1 BETWEEN 100 AND 200;
>                                                   QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=103812.21..103812.22 rows=1 width=0) (actual time=1207.043..1207.043 rows=1 loops=1)
>    ->  Hash Join  (cost=46345.20..103812.21 rows=1 width=0) (actual time=428.632..1207.027 rows=101 loops=1)
>          Hash Cond: (t2.c1 = t1.c1)
>          ->  Seq Scan on t2  (cost=0.00..46217.00 rows=3000000 width=4) (actual time=0.034..375.670 rows=3000000 loops=1)
>          ->  Hash  (cost=46345.19..46345.19 rows=1 width=4) (actual time=428.557..428.557 rows=101 loops=1)
>                Buckets: 1024  Batches: 1  Memory Usage: 13kB
>                ->  Gather  (cost=1000.00..46345.19 rows=1 width=4) (actual time=0.287..428.476 rows=101 loops=1)
>                      Number of Workers: 4
>                      ->  Parallel Seq Scan on t1  (cost=0.00..45345.09 rows=1 width=4) (actual time=340.139..425.591 rows=20 loops=5)
>                            Filter: ((c1 >= 100) AND (c1 <= 200))
>                            Rows Removed by Filter: 1999980
>  Planning time: 0.116 ms
>  Execution time: 1207.196 ms
> (13 rows)
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
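(A side note on [1] above, with my arithmetic from the quoted plans rather than anything stated in the mail: the parallel nested loop rescans t2 in full once per qualifying outer row, so the join filter has to look at about 101 x 3,000,000 = 303,000,000 rows in total, i.e. roughly 60.6 million per participating process, which matches "Rows Removed by Join Filter: 60599980" at loops=5. The serial hash join scans t2 exactly once. Yet the parallel path is costed at 100895 against 261519 for the hash join while actually running 67871 ms against 2780 ms, so the cost model indeed does not seem to charge for that repeated inner scan.)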
Attachments: q2_base.out, q2_parallel.out, q3_base.out, q3_parallel.out, q5_base.out, q5_parallel.out, q7_base.out, q7_parallel.out
-- q3.sql (TPC-H Q3)
explain analyze
select
    l_orderkey,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    o_orderdate,
    o_shippriority
from
    customer,
    orders,
    lineitem
where
    c_mktsegment = 'HOUSEHOLD'
    and c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate < date '1995-03-15'
    and l_shipdate > date '1995-03-15'
group by
    l_orderkey,
    o_orderdate,
    o_shippriority
order by
    revenue desc,
    o_orderdate
LIMIT 10;
-- q5.sql (TPC-H Q5)
EXPLAIN ANALYZE
select
    n_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue
from
    customer,
    orders,
    lineitem,
    supplier,
    nation,
    region
where
    c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and l_suppkey = s_suppkey
    and c_nationkey = s_nationkey
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'AFRICA'
    and o_orderdate >= date '1994-01-01'
    and o_orderdate < date '1994-01-01' + interval '1' year
group by
    n_name
order by
    revenue desc;
-- q7.sql (TPC-H Q7)
EXPLAIN ANALYZE
select
    supp_nation,
    cust_nation,
    l_year,
    sum(volume) as revenue
from
    (
        select
            n1.n_name as supp_nation,
            n2.n_name as cust_nation,
            extract(year from l_shipdate) as l_year,
            l_extendedprice * (1 - l_discount) as volume
        from
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2
        where
            s_suppkey = l_suppkey
            and o_orderkey = l_orderkey
            and c_custkey = o_custkey
            and s_nationkey = n1.n_nationkey
            and c_nationkey = n2.n_nationkey
            and (
                (n1.n_name = 'VIETNAM' and n2.n_name = 'CHINA')
                or (n1.n_name = 'CHINA' and n2.n_name = 'VIETNAM')
            )
            and l_shipdate between date '1995-01-01' and date '1996-12-31'
    ) as shipping
group by
    supp_nation,
    cust_nation,
    l_year
order by
    supp_nation,
    cust_nation,
    l_year;
-- q2.sql (TPC-H Q2)
explain analyze
select
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    part,
    supplier,
    partsupp,
    nation,
    region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 23
    and p_type like '%STEEL'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'ASIA'
    and ps_supplycost = (
        select
            min(ps_supplycost)
        from
            partsupp,
            supplier,
            nation,
            region
        where
            p_partkey = ps_partkey
            and s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = 'ASIA'
    )
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey
LIMIT 100;