Hello,

While doing an experiment, we found what appear to be a few existing problems 
in how the parallel safety of functions is handled.  Could I hear your opinions 
on what we should do?  I'd be willing to create and submit a patch to fix them.

The experiment adds a parallel safety check to FunctionCallInvoke() and runs 
the regression tests with force_parallel_mode=regress.  The added check raises 
ereport(ERROR) when the function about to be called is parallel unsafe and the 
process is currently in parallel mode.  Six test cases failed because the 
following parallel-unsafe functions were called:

    dsnowball_init
    balkifnull
    int44out
    text_w_default_out
    widget_out

The first function is created in src/backend/snowball/snowball_create.sql for 
full text search.  The remaining functions are created during the regression 
test run.
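
To make the experiment concrete, here is a minimal standalone sketch of the 
kind of check described above.  It is only a model, not the actual patch: 
ModelFunction and model_call_allowed() are hypothetical names, the boolean 
argument stands in for IsInParallelMode(), and the code does not use any 
PostgreSQL headers.  Only the PROPARALLEL_* letters match pg_proc.proparallel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same letters that pg_proc.proparallel uses. */
#define PROPARALLEL_SAFE        's'
#define PROPARALLEL_RESTRICTED  'r'
#define PROPARALLEL_UNSAFE      'u'

/* Hypothetical stand-in for a function's catalog information. */
typedef struct ModelFunction
{
    const char *name;           /* function name, for error reporting */
    char        proparallel;    /* one of the PROPARALLEL_* letters */
} ModelFunction;

/*
 * Hypothetical check: returns true if calling fn is allowed.
 * in_parallel_mode stands in for IsInParallelMode().  In the experiment,
 * the case where this model returns false is where ereport(ERROR) fires.
 */
static bool
model_call_allowed(const ModelFunction *fn, bool in_parallel_mode)
{
    assert(fn != NULL);

    if (in_parallel_mode && fn->proparallel == PROPARALLEL_UNSAFE)
        return false;

    return true;
}
```

In this model, a function marked 'u' is rejected only while in parallel mode, 
which is the condition the failing test cases hit.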

The relevant issues follow.


(1)
Judging from their implementations, all of the above functions are actually 
parallel safe.  Their CREATE FUNCTION statements seem to be simply missing the 
PARALLEL SAFE specification, so I think I'll add it.  dsnowball_lexize() may 
also be parallel safe.


(2)
I'm afraid the above failures reveal that PostgreSQL skips parallel safety 
checks in some places.  Specifically, we noticed the following:

* User-defined aggregate
CREATE AGGREGATE allows specifying the parallel safety of the aggregate itself, 
and the planner checks it, but the aggregate's support functions are not 
checked.  The documentation states this explicitly:

https://www.postgresql.org/docs/devel/xaggr.html

"Worth noting also is that for an aggregate to be executed in parallel, the 
aggregate itself must be marked PARALLEL SAFE. The parallel-safety markings on 
its support functions are not consulted."

https://www.postgresql.org/docs/devel/sql-createaggregate.html

"An aggregate will not be considered for parallelization if it is marked 
PARALLEL UNSAFE (which is the default!) or PARALLEL RESTRICTED. Note that the 
parallel-safety markings of the aggregate's support functions are not consulted 
by the planner, only the marking of the aggregate itself."

Can we check the parallel safety of aggregate support functions during 
statement execution and error out?  Is there any reason not to do so?

* User-defined data type
The input, output, send, receive, and other functions of a UDT are not checked 
for parallel safety.  Is there any good reason not to check them, other than 
the performance concern?

* Functions for full text search
Should CREATE TEXT SEARCH TEMPLATE ensure that the functions are parallel safe?  
(Those functions could be changed to parallel unsafe later with ALTER FUNCTION, 
though.)
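
As a rough illustration of the aggregate question above, a check on support 
functions could look something like the following standalone model.  
Everything here is an assumption for illustration: ModelSupportFunc, 
ModelAggregate and model_first_unsafe_support() are hypothetical names, and the 
code does not touch the real catalogs or executor.

```c
#include <assert.h>
#include <stddef.h>

/* Same letters that pg_proc.proparallel uses. */
#define PROPARALLEL_SAFE    's'
#define PROPARALLEL_UNSAFE  'u'

/* Hypothetical description of one support function of an aggregate. */
typedef struct ModelSupportFunc
{
    const char *role;           /* e.g. "sfunc", "combinefunc" */
    char        proparallel;    /* marking of the support function */
} ModelSupportFunc;

/* Hypothetical description of an aggregate and its support functions. */
typedef struct ModelAggregate
{
    char                    proparallel;    /* marking of the aggregate */
    const ModelSupportFunc *support;
    size_t                  nsupport;
} ModelAggregate;

/*
 * Returns the first support function marked parallel unsafe, or NULL if
 * there is none.  An execution-time check could raise an error when this
 * returns non-NULL while in parallel mode.
 */
static const ModelSupportFunc *
model_first_unsafe_support(const ModelAggregate *agg)
{
    assert(agg != NULL);

    for (size_t i = 0; i < agg->nsupport; i++)
    {
        if (agg->support[i].proparallel == PROPARALLEL_UNSAFE)
            return &agg->support[i];
    }
    return NULL;
}
```

The model shows the gap in question: an aggregate marked 's' can still have a 
support function marked 'u', and nothing in the aggregate's own marking 
reveals that.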


(3) Built-in functions are not checked for parallel safety
The functions defined in fmgr_builtins[], which are derived from pg_proc.dat, 
are not checked.  Most of them are marked parallel safe, but some are parallel 
unsafe or restricted.

Besides, changing their parallel safety with ALTER FUNCTION PARALLEL does not 
affect the choice of query plan.  This is because fmgr_builtins[] does not 
have a member for parallel safety.

Should we add a member for parallel safety to fmgr_builtins[], and disallow 
ALTER FUNCTION from changing the parallel safety of built-in functions?
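
For discussion, the idea of an extra member could be modeled roughly as 
follows.  This is a standalone sketch, not the real FmgrBuiltin definition: 
ModelFmgrBuiltin, its parallel member, model_builtin_parallel(), and the OIDs 
and function names in the table are all placeholders made up for illustration.

```c
#include <stddef.h>

/* Same letters that pg_proc.proparallel uses. */
#define PROPARALLEL_SAFE        's'
#define PROPARALLEL_RESTRICTED  'r'
#define PROPARALLEL_UNSAFE      'u'

/* Hypothetical, reduced model of an fmgr_builtins[] entry. */
typedef struct ModelFmgrBuiltin
{
    unsigned int foid;      /* placeholder OID, not a real pg_proc OID */
    const char  *funcName;  /* placeholder C function name */
    char         parallel;  /* hypothetical new member */
} ModelFmgrBuiltin;

/* Placeholder table; real entries would be generated from pg_proc.dat. */
static const ModelFmgrBuiltin model_builtins[] = {
    {1, "model_safe_fn", PROPARALLEL_SAFE},
    {2, "model_restricted_fn", PROPARALLEL_RESTRICTED},
    {3, "model_unsafe_fn", PROPARALLEL_UNSAFE},
};

/*
 * Looks up the parallel-safety letter for a built-in function by OID.
 * Returns '\0' if the OID is not in the table.
 */
static char
model_builtin_parallel(unsigned int foid)
{
    size_t n = sizeof(model_builtins) / sizeof(model_builtins[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (model_builtins[i].foid == foid)
            return model_builtins[i].parallel;
    }
    return '\0';
}
```

With such a member, a check at call time would not need a catalog lookup for 
built-in functions, which is also why ALTER FUNCTION would then have to be 
disallowed for them: the table is fixed at build time.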


Regards
Takayuki Tsunakawa



