On Thu, Dec 3, 2015 at 3:25 AM, Robert Haas <robertmh...@gmail.com> wrote: > > On Tue, Dec 1, 2015 at 7:21 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: > > It would be better if we can split this patch into multiple patches like > > Explain related changes, Append pushdown related changes, Join > > Push down related changes. You can choose to push the patches as > > you prefer, but splitting can certainly help in review/verification of the > > code. > > I don't think it really makes sense to split the append push-down > changes from the join push-down changes; those share a great deal of > code.
Not an issue. I have started looking into parallel join patch and below are few findings: 1. There are few compilation errors in the patch. It seems patch needs to adapt the latest changes done in commit-edca44b1. 1>src/backend/optimizer/path/joinpath.c(420): error C2039: 'extra_lateral_rels' : is not a member of 'JoinPathExtraData' 1> E:\WorkSpace\PostgreSQL\master\postgresql\src\include\nodes/relation.h(1727) : see declaration of 'JoinPathExtraData' .. .. 2. Why consider_parallel_nestloop() doesn't consider materializing inner relation as we do in match_unsorted_outer()? I have generated a test as below where non-parallel Nestloop join is faster than parallel Nestloop join. I am using 'hydra' for testing this patch. CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM generate_series(1, 10000000) g; CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM generate_series(1, 2000000) g; Analyze t1; Analyze t2; Restart Server Connect with psql set enable_hashjoin=off; set enable_mergejoin=off; postgres=# Explain Analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1 BETWEEN 100000 AND 100100; QUERY PLAN ------------------------------------------------------------------------------------------------------------- -------------- Aggregate (cost=3294864.21..3294864.21 rows=1 width=0) (actual time=42614.102..42614.102 rows=1 loops=1) -> Nested Loop (cost=0.00..3294864.16 rows=20 width=0) (actual time=4123.463..42614.084 rows=101 loops=1) Join Filter: (t1.c1 = t2.c1) Rows Removed by Join Filter: 201999899 -> Seq Scan on t2 (cost=0.00..30811.00 rows=2000000 width=4) (actual time=0.027..284.979 rows=2000000 loops=1) -> Materialize (cost=0.00..204053.41 rows=102 width=4) (actual time=0.000..0.008 rows=101 loops=2000000) -> Seq Scan on t1 (cost=0.00..204052.90 rows=102 width=4) (actual time=13.920..2024.684 rows=101 loops=1) Filter: ((c1 >= 100000) AND (c1 <= 100100)) Rows Removed by Filter: 9999899 Planning time: 0.085 ms Execution time: 42614.135 ms I have repeated the above statement 3 times and the above result is median of 3 runs. Restart Server Connect with psql set enable_hashjoin=off; set enable_mergejoin=off; set max_parallel_degree=4; Explain Analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1 BETWEEN 100000 AND 100100; QUERY PLAN ------------------------------------------------------------------------------------------------------------- --------------------- Aggregate (cost=1311396.47..1311396.48 rows=1 width=0) (actual time=45736.973..45736.973 rows=1 loops=1) -> Gather (cost=1000.00..1311396.42 rows=20 width=0) (actual time=709.083..45736.925 rows=101 loops=1) Number of Workers: 4 -> Nested Loop (cost=0.00..1310394.42 rows=20 width=0) (actual time=436.460..11240.321 rows=20 loops=5) Join Filter: (t1.c1 = t2.c1) Rows Removed by Join Filter: 40399980 -> Parallel Seq Scan on t1 (cost=0.00..45345.09 rows=23 width=4) (actual time=425.178..425.232 rows=20 loops=5) Filter: ((c1 >= 100000) AND (c1 <= 100100)) Rows Removed by Filter: 1999980 -> Seq Scan on t2 (cost=0.00..30811.00 rows=2000000 width=4) (actual time=0.011..270.986 rows=2000000 loops=101) Planning time: 0.115 ms Execution time: 45737.863 ms I have repeated the above statement 3 times and the above result is median of 3 runs. Now here the point to observe is that non-parallel case uses both less Execution time and Planning time to complete the statement. There is a considerable increase in planning time without any benefit in execution. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com