Re: Why overhead of SPI is so large?

Tom Lane Mon, 11 Nov 2019 09:24:00 -0800

Konstantin Knizhnik <k.knizh...@postgrespro.ru> writes:
>> čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi 
>> <horikyota....@gmail.com <mailto:horikyota....@gmail.com>> napsal:
>>> I might be too worrying, but maybe we should write the function in
>>> white-listed way, that is, expr_needs_snapshot returns false only if
>>> the whole tree consists of only the node types known to the
>>> function. And any unknown node makes the function return true
>>> immediately.


> There are  62 cases handled by expression_tree_walker.
> I have inspected all of them and considered only these 6 to be 
> "dangerous": intentionally require snapshot.
> FuncExpr, OpExpr, Query, SubPlan, AlternativeSubPlan, SubLink.

This is false on its face.  For example, considering OpExpr but not
its aliases DistinctExpr or NullIfExpr is just obviously wrong.
ScalarArrayOpExpr is another node that might contain a user-defined
function.  CoerceViaIO calls I/O functions that might easily not be
immutable.  RowCompareExpr calls comparison functions that might
not be either (they are btree comparison functions, but they could
be cross-type comparisons, and we only require those to be stable).
CoerceToDomain could invoke just about anything at all.  Need I
go on?

> Frankly speaking I do not this that it is good idea. New nodes are 
> rarely added to the Postgres and expression_tree_walker
> in any case has to be changed to handle this new nodes. There are many 
> other places where tree walker is used to check presence (or absence)
> of some properties in the tree. And none is this places assume that some 
> newly introduced node may require special handling of this property check.

None of those statements are true, in my experience.

In general, this patch seems like it's learned nothing from our
experiences with the late and unlamented exec_simple_check_node()
(cf commit 00418c612).  Having a piece of plpgsql that has to know
about all possible "simple expression" node types is just a major
maintenance headache; but there is no short-cut solution that is
really going to be acceptable.  Either you're unsafe, or you fail
to optimize cases that users will complain about, or you write
and maintain a lot of code.

I'm also pretty displeased with the is-it-in-the-pg_catalog-schema
tests.  Those are expensive, requiring additional catalog lookups,
and they prove just about nothing.  There are extensions that shove
stuff into pg_catalog (look no further than contrib/adminpack), or
users could do it, so you really still are relying on the whole world
to get immutability right.  If we're going to not trust immutability
markings on user-defined objects, I'd be inclined to do it by
checking that the object's OID is less than FirstGenbkiObjectId.

But maybe we should just forget that bit of paranoia and rely solely
on contain_mutable_functions().  That gets rid of the new maintenance
requirement, and I think arguably it's OK.  An "immutable" function
whose result depends on changes that were just made by the calling
function is pretty obviously mislabeled, so people won't have much of
a leg to stand on if they complain about that.  Pavel's argument
upthread that people sometimes intentionally mislabel functions as
immutable isn't really relevant: in his example, at least, the user
is intentionally trying to get the function to be evaluated early.
So whether it sees the effects of changes just made by the calling
function doesn't seem like it would matter to such a user.

                        regards, tom lane

Re: Why overhead of SPI is so large?

Reply via email to