Dear community, I need a bit of advice on parallel query execution. I have a table with roughly 2 million rows in it. These rows are distributed over a set of IDs. Therefore I can group my rows based on the ID set.
If I query the table directly like this: select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where mediatrunkid in (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028) group by 1; mediatrunkid | count --------------+------- 409612 | 11862 409613 | 11862 2101260 | 11860 2101261 | 11882 2101263 | 11809 2109452 | 11751 2117644 | 11833 2125836 | 11832 2125839 | 11852 2125841 | 11882 2129932 | 11882 2129933 | 11854 2134028 | 11718 2138124 | 11850 2142220 | 11861 2154508 | 11825 2187276 | 11826 (17 rows) Time: 44,056 ms I see from the query plan that it executed nicely in parallel using two workers: 2018-02-06 08:18:47.381 CET [18898] LOG: duration: 43.072 ms plan: Query Text: select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where mediatrunkid in (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028) group by 1; Finalize GroupAggregate (cost=57784.41..57792.66 rows=300 width=40) (actual time=40.583..40.596 rows=17 loops=1) Group Key: mediatrunkid Buffers: shared hit=3076 -> Sort (cost=57784.41..57785.91 rows=600 width=16) (actual time=40.578..40.582 rows=51 loops=1) Sort Key: mediatrunkid Sort Method: quicksort Memory: 27kB Buffers: shared hit=3076 -> Gather (cost=57693.72..57756.72 rows=600 width=16) (actual time=40.534..40.561 rows=51 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=3076 -> Partial HashAggregate (cost=56693.72..56696.72 rows=300 width=16) (actual time=33.901..33.909 rows=17 loops=3) Group Key: mediatrunkid Buffers: shared hit=6006 -> Parallel Bitmap Heap Scan on mediatrunkkpi (cost=4525.01..56279.28 rows=82889 width=8) (actual time=4.575..23.145 rows=67080 loops=3) Recheck Cond: (mediatrunkid = ANY ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[])) Heap Blocks: exact=2253 Buffers: shared hit=6006 -> Bitmap Index Scan on idx_mediatrunkkpi_trunk (cost=0.00..4475.27 rows=198933 width=0) (actual time=10.057..10.057 rows=201241 loops=1) Index Cond: (mediatrunkid = ANY ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[])) Buffers: shared hit=823 But when I do the very same within a function: create or replace function reports.generic_query(_sql text) RETURNS SETOF record LANGUAGE 'plpgsql' PARALLEL SAFE COST 100 SET "TimeZone"='utc' STABLE ROWS 10000 AS $BODY$ DECLARE -- BEGIN RETURN QUERY EXECUTE _sql; END $BODY$; select * from reports.generic_query($$ select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where mediatrunkid in (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028) group by 1 $$) as foo (trunkid bigint, count numeric); trunkid | count ---------+------- 2109452 | 11751 409613 | 11862 2125839 | 11852 2125841 | 11882 2142220 | 11861 2117644 | 11833 2154508 | 11825 2134028 | 11718 2101263 | 11809 2101261 | 11882 2129933 | 11854 2129932 | 11882 2125836 | 11832 2138124 | 11850 409612 | 11862 2187276 | 11826 2101260 | 11860 (17 rows) Time: 86,275 ms the execution time almost doubles, and the query no longer is considered for parallel execution: 2018-02-06 08:20:30.553 CET [18898] LOG: duration: 85.358 ms plan: Query Text: select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where mediatrunkid in (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028) group by 1 HashAggregate (cost=60900.33..60904.08 rows=300 width=40) (actual time=85.341..85.348 rows=17 loops=1) Group Key: mediatrunkid Buffers: shared hit=6006 -> Bitmap Heap Scan on mediatrunkkpi (cost=4525.01..59905.66 rows=198933 width=8) (actual time=9.084..39.655 rows=201241 loops=1) Recheck Cond: (mediatrunkid = ANY ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[])) Heap Blocks: exact=5183 Buffers: shared hit=6006 -> Bitmap Index Scan on idx_mediatrunkkpi_trunk (cost=0.00..4475.27 rows=198933 width=0) (actual time=8.386..8.386 rows=201241 loops=1) Index Cond: (mediatrunkid = ANY ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[])) Buffers: shared hit=823 2018-02-06 08:20:30.553 CET [18898] CONTEXT: PL/pgSQL function reports.generic_query(text) line 5 at RETURN QUERY 2018-02-06 08:20:30.553 CET [18898] LOG: duration: 85.986 ms plan: Query Text: select * from reports.generic_query($$ select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where mediatrunkid in (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028) group by 1 $$) as foo (trunkid bigint, count numeric); Function Scan on generic_query foo (cost=0.25..100.25 rows=10000 width=40) (actual time=85.968..85.970 rows=17 loops=1) Buffers: shared hit=6006 This whole topic around parallel execution is fairly new, but from the documentation I would assume that functions declared as parallel safe, will really be used in parallel. And even if the function itself is not invoked in parallel (maybe does not even make sense here), the function body for sure should run in parallel if I'm not mistaken. So what do I obviously do wrong here? Regards, Michael