On Thu, Nov 26, 2015 at 8:11 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
> Attached find a patch that does (mostly) two things.
>
I have started looking into this and would like to share a few findings
with you:

-
+ /*
+  * Primitive parallel cost model.  Assume the leader will do half as much
+  * work as a regular worker, because it will also need to read the tuples
+  * returned by the workers when they percolate up to the gather ndoe.
+  * This is almost certainly not exactly the right way to model this, so
+  * this will probably need to be changed at some point...
+  */
+ if (path->parallel_degree > 0)
+     path->rows = path->rows / (path->parallel_degree + 0.5);
+
  if (!enable_seqscan)
      startup_cost += disable_cost;
@@ -225,16 +234,6 @@ cost_seqscan(Path *path, PlannerInfo *root,
      cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
      run_cost += cpu_per_tuple * baserel->tuples;

- /*
-  * Primitive parallel cost model.  Assume the leader will do half as much
-  * work as a regular worker, because it will also need to read the tuples
-  * returned by the workers when they percolate up to the gather ndoe.
-  * This is almost certainly not exactly the right way to model this, so
-  * this will probably need to be changed at some point...
-  */
- if (nworkers > 0)
-     run_cost = run_cost / (nworkers + 0.5);
-
..

The above change, together with the changes in add_path(), makes the
planner *not* select a parallel path for a seq scan where earlier it was
possible.  I think you want to change the costing of parallel plans
based on rows selected instead of total_cost, but there seems to be some
problem in the logic (I think the gather node is not taking the reduced
cost into account).  Consider the below case:

create table tbl_parallel_test(c1 int, c2 char(1000));
insert into tbl_parallel_test values(generate_series(1,1000000),'aaaaa');
Analyze tbl_parallel_test;
set max_parallel_degree=6;
Explain select count(*) from tbl_parallel_test where c1 < 10000;

Without the patch, the planner is able to use a parallel plan for the
above case; with the patch, it is not.

- There seems to be some inconsistency in Explain's output when multiple
workers are used.
Case-1
------
Consider the table is populated as mentioned above.
change max_worker_processes=2 in postgresql.conf
set max_parallel_degree=4;

Explain (Analyze,Verbose) select count(*) from tbl_parallel_test where c1 < 10000;
                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=46227.78..46227.79 rows=1 width=0) (actual time=182583.554..182583.555 rows=1 loops=1)
   Output: count(*)
   ->  Gather  (cost=1000.00..46203.83 rows=9579 width=0) (actual time=167930.039..182571.654 rows=9999 loops=1)
         Output: c1, c2
         Number of Workers: 4
         ->  Parallel Seq Scan on public.tbl_parallel_test  (cost=0.00..44245.93 rows=2129 width=0) (actual time=167904.516..182498.494 rows=3333 loops=3)
               Output: c1, c2
               Filter: (tbl_parallel_test.c1 < 10000)
               Rows Removed by Filter: 330000
               Worker 0: actual time=167890.584..182491.043 rows=4564 loops=1
               Worker 1: actual time=167893.651..182461.904 rows=2740 loops=1
 Planning time: 0.121 ms
 Execution time: 182588.419 ms
(13 rows)

1. Rows Removed by Filter should be 990001.
2. The total of rows=9999 at the Gather node is right, but it is not
obvious how the rows from each worker and the leader add up to that
total.
Case-2
------
change max_worker_processes=8 in postgresql.conf
set max_parallel_degree=4;

postgres=# Explain (Analyze,Verbose) select count(*) from tbl_parallel_test where c1 < 10000;
                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=46227.78..46227.79 rows=1 width=0) (actual time=39365.233..39365.234 rows=1 loops=1)
   Output: count(*)
   ->  Gather  (cost=1000.00..46203.83 rows=9579 width=0) (actual time=47.485..39344.574 rows=9999 loops=1)
         Output: c1, c2
         Number of Workers: 4
         ->  Parallel Seq Scan on public.tbl_parallel_test  (cost=0.00..44245.93 rows=2129 width=0) (actual time=11.910..39262.255 rows=2000 loops=5)
               Output: c1, c2
               Filter: (tbl_parallel_test.c1 < 10000)
               Rows Removed by Filter: 198000
               Worker 0: actual time=5.931..39249.068 rows=3143 loops=1
               Worker 1: actual time=5.778..39254.504 rows=1687 loops=1
               Worker 2: actual time=0.836..39264.683 rows=2170 loops=1
               Worker 3: actual time=1.101..39251.459 rows=1715 loops=1
 Planning time: 0.123 ms
 Execution time: 39383.296 ms
(15 rows)

The problems reported in the previous case are visible in this case as
well.
I think both are due to the same problem.

Case-3
------
postgres=# Explain (Analyze,Verbose,Buffers) select count(*) from tbl_parallel_test where c1 < 10000;
                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=46227.78..46227.79 rows=1 width=0) (actual time=33607.146..33607.147 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=548 read=142506
   ->  Gather  (cost=1000.00..46203.83 rows=9579 width=0) (actual time=33.983..33592.030 rows=9999 loops=1)
         Output: c1, c2
         Number of Workers: 4
         Buffers: shared hit=548 read=142506
         ->  Parallel Seq Scan on public.tbl_parallel_test  (cost=0.00..44245.93 rows=2129 width=0) (actual time=13.447..33354.099 rows=2000 loops=5)
               Output: c1, c2
               Filter: (tbl_parallel_test.c1 < 10000)
               Rows Removed by Filter: 198000
               Buffers: shared hit=352 read=142506
               Worker 0: actual time=18.422..33322.132 rows=2170 loops=1
                 Buffers: shared hit=4 read=30765
               Worker 1: actual time=0.803..33283.979 rows=1890 loops=1
                 Buffers: shared hit=1 read=26679
               Worker 2: actual time=0.711..33360.007 rows=1946 loops=1
                 Buffers: shared hit=197 read=30899
               Worker 3: actual time=15.057..33252.605 rows=2145 loops=1
                 Buffers: shared hit=145 read=25433
 Planning time: 0.217 ms
 Execution time: 33612.964 ms
(22 rows)

I am not able to understand how the buffer usage adds up to what is
shown at the Gather node.

- I think it would be better if we add some explanation to the Explain -
Verbose section of the documentation, along with an example on the same
page.  This can help users understand this feature.

It would be better if we can split this patch into multiple patches,
like the Explain-related changes, the Append pushdown related changes,
and the Join pushdown related changes.  You can choose to push the
patches as you prefer, but splitting can certainly help in
review/verification of the code.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com