On 29 April 2016 at 02:41, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Apr 27, 2016 at 1:05 PM, Daniel Verite <dan...@manitou-mail.org> wrote:
>> Robert Haas wrote:
>>> Of course, we could make this value 1-based rather than 0-based, as
>>> Peter Geoghegan suggested a while back.  But as I think I said at the
>>> time, I think that's more misleading than helpful.  The leader
>>> participates in the parallel plan, but typically does far less of the
>>> work beneath the Gather node than the other nodes involved in the
>>> query, often almost none.  In short, the leader is special.
>>> Pretending that it's just another process involved in the parallel
>>> group isn't doing anyone a favor.
>>
>> FWIW, that's not how it looks from the outside (top or vmstat).
>> I'm ignorant about how parallel tasks are assigned in the planner,
>> but when trying various values for max_parallel_degree and running
>> simple aggregates on large tables on a single 4 core CPU doing
>> nothing else, I'm only ever seeing max_parallel_degree+1 processes
>> indiscriminately at work, often in the same state (R running or
>> D waiting for disk).
>
> Right, but they're probably not doing the SAME work.  You can look at
> EXPLAIN (ANALYZE, VERBOSE, BUFFERS) to see.  Of course, all the work
> above the Gather node is being done by the leader, but the stuff below
> the Gather node often has a bit of participation from the leader, but
> is mostly the workers.
Robert, I'd imagine that most of the tests which led you to that conclusion came from parallel seq scan, whereas Daniel's likely come from testing something like parallel aggregate, or at least something that gives the workers a decent amount of work per tuple returned.

With the setting:

# set max_parallel_degree = 8;

Given a table like:

# create table t1 (num int not null);

Populated with:

# insert into t1 select generate_series(1,10000000);

Given the query:

# explain (analyze, verbose) select count(*) from t1;

If we think about what'll happen here: each worker will go off, grab all of the tuples it can, and aggregate each one. Meanwhile the main process would otherwise be quite idle, waiting for the workers to come back with their partially aggregated results, so instead, to keep itself busy, it goes off and helps them out. We can see this is true in the explain analyze verbose output:

                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=80508.54..80508.55 rows=1 width=8) (actual time=605.019..605.019 rows=1 loops=1)
   Output: pg_catalog.count(*)
   ->  Gather  (cost=80508.13..80508.53 rows=4 width=8) (actual time=604.799..605.011 rows=5 loops=1)
         Output: (count(*))
         Workers Planned: 4
         Workers Launched: 4
         ->  Partial Aggregate  (cost=79508.13..79508.13 rows=1 width=8) (actual time=585.099..585.100 rows=1 loops=5)
               Output: count(*)
               Worker 0: actual time=579.736..579.736 rows=1 loops=1
               Worker 1: actual time=580.669..580.669 rows=1 loops=1
               Worker 2: actual time=580.512..580.513 rows=1 loops=1
               Worker 3: actual time=580.649..580.649 rows=1 loops=1
               ->  Parallel Seq Scan on public.t1  (cost=0.00..72456.10 rows=2820810 width=0) (actual time=2.310..404.978 rows=2000000 loops=5)
                     Output: num
                     Worker 0: actual time=2.231..397.251 rows=1892702 loops=1
                     Worker 1: actual time=3.107..403.436 rows=1983602 loops=1
                     Worker 2: actual time=3.030..403.082 rows=1952188 loops=1
                     Worker 3: actual time=3.135..404.756 rows=2039650 loops=1

If we take the total number of rows that the workers managed to chew through, and subtract that from the total rows in t1:

# select 10000000 - (1892702 + 1983602 + 1952188 + 2039650);
 ?column?
----------
  2131858
(1 row)

So the main process managed to get through 2131858 rows while waiting for the helpers to finish their work. Daniel looks right.

Another example, this time with no aggregation:

postgres=# explain (analyze, verbose) select * from t1 where num > 9500000;

                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..126718.15 rows=502200 width=4) (actual time=454.071..694.967 rows=500000 loops=1)
   Output: num
   Workers Planned: 4
   Workers Launched: 4
   ->  Parallel Seq Scan on public.t1  (cost=0.00..76753.65 rows=125550 width=4) (actual time=430.240..451.990 rows=100000 loops=5)
         Output: num
         Filter: (t1.num > 9500000)
         Rows Removed by Filter: 1900000
         Worker 0: actual time=421.220..449.077 rows=130128 loops=1
         Worker 1: actual time=428.549..456.059 rows=125430 loops=1
         Worker 2: actual time=421.372..447.751 rows=113904 loops=1
         Worker 3: actual time=427.140..454.118 rows=130402 loops=1

# select 500000 - (130128 + 125430 + 113904 + 130402);
 ?column?
----------
      136
(1 row)

This time the main process only managed to get time for 136 rows! Robert is right.

So I'd say this very much depends on how busy the main process is kept pulling rows from each worker. It would also be quite nice if we could see at a glance how much the main process did, without having to subtract what all the workers managed to do.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
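As an aside, the subtraction done by hand above generalises: the leader's contribution to a parallel node is just the node's total rows minus the sum of the per-worker "rows=" figures from EXPLAIN (ANALYZE, VERBOSE). A minimal sketch in plain Python (the function name is my own, purely illustrative):

```python
# Estimate how many rows the leader (main process) handled, given the
# total rows a parallel plan node produced across all loops and the
# per-worker "rows=" figures reported by EXPLAIN (ANALYZE, VERBOSE).

def leader_rows(total_rows, worker_rows):
    """Rows handled by the leader = total minus what the workers did."""
    return total_rows - sum(worker_rows)

# First example: parallel aggregate over 10M rows -- the leader pulled
# a full share of the scan while waiting on partial results.
print(leader_rows(10_000_000, [1_892_702, 1_983_602, 1_952_188, 2_039_650]))  # 2131858

# Second example: plain Gather returning 500k rows -- the leader was
# kept busy pulling tuples from the workers instead.
print(leader_rows(500_000, [130_128, 125_430, 113_904, 130_402]))  # 136
```

This is the same arithmetic as the two SELECTs above, just wrapped up so it can be applied to any parallel node's output.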