On Fri, May 17, 2019 at 12:45 AM Korry Douglas <ko...@me.com> wrote:
> It sounds like there is no reliable way to get the information that I’m
> looking for, is that right?
Correct. And if there were, it could only be used to write bugs. Let me see if I can demonstrate... I'll use the file_fdw patch from the link I gave before, and I'll add an elog(LOG) message to show when fileIterateForeignScan() runs.

$ echo 1 > /tmp/numbers

postgres=# create table t1 as select generate_series(1, 1000000)::int i;
SELECT 1000000
postgres=# create server files foreign data wrapper file_fdw;
CREATE SERVER
postgres=# create foreign table t2 (n int) server files options (filename '/tmp/numbers', format 'csv');
CREATE FOREIGN TABLE

The relevant EXPLAIN output is harder to understand if the parallel leader participates, but that changes nothing important, so I'll turn it off first, and then see how the query is run:

postgres=# set parallel_leader_participation = off;
SET
postgres=# explain (analyze, verbose) select count(*) from (select * from t1 union all select * from t2) ss;
                                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=14176.32..14176.33 rows=1 width=8) (actual time=234.023..234.023 rows=1 loops=1)
   Output: count(*)
   ->  Gather  (cost=14176.10..14176.31 rows=2 width=8) (actual time=233.840..235.079 rows=2 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=13176.10..13176.11 rows=1 width=8) (actual time=223.550..223.555 rows=1 loops=2)
               Output: PARTIAL count(*)
               Worker 0: actual time=223.432..223.443 rows=1 loops=1
               Worker 1: actual time=223.667..223.668 rows=1 loops=1
               ->  Parallel Append  (cost=0.00..11926.10 rows=500000 width=0) (actual time=0.087..166.669 rows=500000 loops=2)
                     Worker 0: actual time=0.083..166.366 rows=499687 loops=1
                     Worker 1: actual time=0.092..166.972 rows=500314 loops=1
                     ->  Parallel Seq Scan on public.t1  (cost=0.00..9425.00 rows=500000 width=0) (actual time=0.106..103.384 rows=500000 loops=2)
                           Worker 0: actual time=0.123..103.106 rows=499686 loops=1
                           Worker 1: actual time=0.089..103.662 rows=500314 loops=1
                     ->  Parallel Foreign Scan on public.t2  (cost=0.00..1.10 rows=1 width=0) (actual time=0.079..0.096 rows=1 loops=1)
                           Foreign File: /tmp/numbers
                           Foreign File Size: 2 b
                           Worker 0: actual time=0.079..0.096 rows=1 loops=1
 Planning Time: 0.219 ms
 Execution Time: 235.262 ms
(22 rows)

You can see that the Parallel Foreign Scan was only actually run by one worker. So if you were somehow expecting both workers to show up in order to produce correct results, you have a bug. The reason that happened is that Parallel Append sent one worker to chew on t1 and the other to chew on t2, but the scan of t2 finished very quickly, so that worker then went to help out with t1. For further proof of that, here's what I see in my server log (note that fileIterateForeignScan() is only ever called twice, once to return the single row and once more to report end-of-file, and always in the same process):

2019-05-17 10:51:42.248 NZST [52158] LOG:  fileIterateForeignScan
2019-05-17 10:51:42.248 NZST [52158] STATEMENT:  explain analyze select count(*) from (select * from t1 union all select * from t2) ss;
2019-05-17 10:51:42.249 NZST [52158] LOG:  fileIterateForeignScan
2019-05-17 10:51:42.249 NZST [52158] STATEMENT:  explain analyze select count(*) from (select * from t1 union all select * from t2) ss;

Therefore you can't allocate the work up front based on the expected number of workers, even if that happens to work in simple examples. Your node isn't necessarily the only node in the plan, and nodes higher up get to decide when, if at all, you run in each worker.

--
Thomas Munro
https://enterprisedb.com
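The instrumentation mentioned at the top is tiny; presumably it amounts to a single line at the head of fileIterateForeignScan() in contrib/file_fdw/file_fdw.c, emitting exactly the message that appears in the server log above:

	elog(LOG, "fileIterateForeignScan");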
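For reference, here is a sketch of the pull-based alternative that the closing paragraph points at. It is written against the real parallel FDW callbacks (EstimateDSMForeignScan, InitializeDSMForeignScan, InitializeWorkerForeignScan), but everything named My*/my_* is hypothetical and the per-scan bookkeeping is elided: each participant that happens to execute the node claims chunks from a shared atomic counter on demand, the same trick Parallel Seq Scan uses, so correctness never depends on how many workers show up.

	#include "postgres.h"

	#include "access/parallel.h"
	#include "foreign/fdwapi.h"
	#include "port/atomics.h"

	/* Sketch only: shared state living in the DSM segment for this query. */
	typedef struct MyParallelScanState
	{
		uint64		nchunks;		/* total chunks in the external data */
		pg_atomic_uint64 next_chunk;	/* next chunk nobody has claimed yet */
	} MyParallelScanState;

	/* Tell the core how much DSM space this node wants. */
	static Size
	myEstimateDSMForeignScan(ForeignScanState *node, ParallelContext *pcxt)
	{
		return sizeof(MyParallelScanState);
	}

	/* Leader: initialize the shared state before any workers are launched. */
	static void
	myInitializeDSMForeignScan(ForeignScanState *node, ParallelContext *pcxt,
							   void *coordinate)
	{
		MyParallelScanState *pstate = (MyParallelScanState *) coordinate;

		pstate->nchunks = 64;	/* a real FDW would derive this from its data */
		pg_atomic_init_u64(&pstate->next_chunk, 0);
		/* a real FDW would also remember pstate in its private scan state */
	}

	/* Worker: attach to the shared state the leader set up. */
	static void
	myInitializeWorkerForeignScan(ForeignScanState *node, shm_toc *toc,
								  void *coordinate)
	{
		/* coordinate points at the same MyParallelScanState; remember it too */
	}

	/*
	 * Claim the next unclaimed chunk, or return false if the scan is done.
	 * Nothing is handed out up front, so it's fine if only one participant,
	 * or none at all, ever runs this node.
	 */
	static bool
	my_claim_next_chunk(MyParallelScanState *pstate, uint64 *chunk)
	{
		*chunk = pg_atomic_fetch_add_u64(&pstate->next_chunk, 1);
		return *chunk < pstate->nchunks;
	}

The FDW's IterateForeignScan callback would call my_claim_next_chunk() each time it exhausts its current chunk, and return an empty tuple slot once it gets false, regardless of which process it happens to be running in.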