Hello everyone,

With the commit 556f7b7bc18d34ddec45392965c3b3038206bb62, Any plpgsql
function that returns scalar value would not be able to use parallelism to
evaluate a return statement. It will not be considered for parallel
execution because we are passing maxtuples = 2 to exec_run_select from
exec_eval_expr to evaluate the return expression of the function.

Call stake to ExecutePlan -

#0  ExecutePlan (queryDesc=0x589c390, operation=CMD_SELECT,
sendTuples=true, numberTuples=2, direction=ForwardScanDirection,
dest=0xe15ca0 <spi_printtupDR>) at execMain.c:1654

#1  0x000000000075edb6 in standard_ExecutorRun (queryDesc=0x589c390,
direction=ForwardScanDirection, count=2, execute_once=true) at
execMain.c:366

#2  0x00007f5749c9b8d8 in explain_ExecutorRun (queryDesc=0x589c390,
direction=ForwardScanDirection, count=2, execute_once=true) at
auto_explain.c:334

#3  0x000000000075ec25 in ExecutorRun (queryDesc=0x589c390,
direction=ForwardScanDirection, count=2, execute_once=true) at
execMain.c:310

#4  0x00000000007c4a48 in _SPI_pquery (queryDesc=0x589c390,
fire_triggers=true, tcount=2) at spi.c:2980

#5  0x00000000007c44a9 in _SPI_execute_plan (plan=0x5878780,
options=0x7ffc6ad467e0, snapshot=0x0, crosscheck_snapshot=0x0,
fire_triggers=true) at spi.c:2747

#6  0x00000000007c135f in SPI_execute_plan_with_paramlist (plan=0x5878780,
params=0x0, read_only=false, tcount=2) at spi.c:765

#7  0x00007f5749eb4a8b in exec_run_select (estate=0x7ffc6ad46ba0,
expr=0x5892b80, maxtuples=2, portalP=0x0) at pl_exec.c:5840 <-- maxtuples =
2

#8  0x00007f5749eb46fe in exec_eval_expr (estate=0x7ffc6ad46ba0,
expr=0x5892b80, isNull=0x7ffc6ad46bc0, rettype=0x7ffc6ad46bc4,
rettypmod=0x7ffc6ad468e8) at pl_exec.c:5734


Consider the following simple repro –



postgres=# create table test_tab(a int);

CREATE TABLE

postgres=# insert into test_tab (a) SELECT generate_series(1, 1000000);

INSERT 0 1000000

postgres=# analyse test_tab;

ANALYZE

postgres=# create function test_plpgsql() returns int

language plpgsql

as

$$

begin

return (select count(*) from test_tab where a between 5.0 and 999999.0);

end;

$$;

postgres=# LOAD 'auto_explain';

LOAD

postgres=# SET auto_explain.log_min_duration = 0;

SET

postgres=# SET auto_explain.log_analyze = true;

SET

postgres=# SET auto_explain.log_nested_statements = true;

SET

postgres=# select test_plpgsql();

test_plpgsql

--------------

999995

(1 row)



Plan logged in logfile -

    Query Text: (select count(*) from test_tab where a between 5.0 and
999999.0)

    Result  (cost=13763.77..13763.78 rows=1 width=8) (actual
time=912.108..912.110 rows=1 loops=1)

      InitPlan 1

        ->  Finalize Aggregate  (cost=13763.76..13763.77 rows=1 width=8)
(actual time=912.103..912.104 rows=1 loops=1)

              ->  Gather  (cost=13763.54..13763.75 rows=2 width=8) (actual
time=912.096..912.098 rows=1 loops=1)

                    Workers Planned: 2

                    *Workers Launched: 0*

                    ->  Partial Aggregate  (cost=12763.54..12763.55 rows=1
width=8) (actual time=912.095..912.096 rows=1 loops=1)

                          ->  Parallel Seq Scan on test_tab
(cost=0.00..12758.33 rows=2083 width=0) (actual time=0.022..812.253
rows=999995 loops=1)

                                Filter: (((a)::numeric >= 5.0) AND
((a)::numeric <= 999999.0))

                                Rows Removed by Filter: 5



Patch to fix this issue is attached. Proposed fix should not cause any
regression because the number of returned rows is anyway being checked
later inside exec_eval_expr(…).


Plan logged after fix –



Query Text: (select count(*) from test_tab where a between 5.0 and 999999.0)

    Result  (cost=13763.77..13763.78 rows=1 width=8) (actual
time=324.397..328.007 rows=1.00 loops=1)

      InitPlan 1

        ->  Finalize Aggregate  (cost=13763.76..13763.77 rows=1 width=8)
(actual time=324.391..327.999 rows=1.00 loops=1)

              ->  Gather  (cost=13763.54..13763.75 rows=2 width=8) (actual
time=324.052..327.989 rows=3.00 loops=1)

                    Workers Planned: 2

                    *Workers Launched: 2*

                    ->  Partial Aggregate  (cost=12763.54..12763.55 rows=1
width=8) (actual time=320.254..320.255 rows=1.00 loops=3)

                          ->  Parallel Seq Scan on test_tab
(cost=0.00..12758.33 rows=2083 width=0) (actual time=0.029..286.410
rows=333331.67 loops=3)

                                Filter: (((a)::numeric >= 5.0) AND
((a)::numeric <= 999999.0))

                                Rows Removed by Filter: 2

Thanks & Regards,

Dipesh

Attachment: 0001-Allow-parallelism-for-plpgsql-return-expression-afte.patch
Description: Binary data

Reply via email to