Hi Tom,

On Wed, 22 Sep 2021, at 17:09, Tom Lane wrote:
> > The main change is a switch to using SPI for expression evaluation.  The 
> > plans are also cached along the same lines as the RI trigger plans.
> 
> I really dislike this implementation technique.  Aside from the likely
> performance hit for existing triggers, I think it opens serious security
> holes, because we can't fully guarantee that deparse-and-reparse doesn't
> change the semantics.  For comparison, the RI trigger code goes to
> ridiculous lengths to force exact parsing of the queries it makes,
> and succeeds only because it needs just a very stylized subset of SQL.
> With a generic user-written expression, we'd be at risk for several
> inherently-ambiguous SQL constructs such as IS DISTINCT FROM (see
> relevant reading at [1]).

Where do you consider the performance hit to be? I just read the code again. 
The only time the new code paths are hit are when a FOR EACH STATEMENT trigger 
fires that has a WHEN condition. Given the existing restrictions for such a 
scenario, that can only mean a WHEN condition that is a simple function call; 
so, "SELECT foo()" vs "foo()"? Or have I misunderstood?

Regarding the deparse-and-reparse --- if I understand correctly, the core 
problem is that we have no way of going from a node tree to a string, such that 
the string is guaranteed to have the same meaning as the node tree? (I did try 
just now to produce such a scenario with the patch but I couldn't get ruleutils 
to emit the wrong thing).  Moreover, we couldn't store the string for use with 
SPI, as the string would be subject to trigger-time search path lookups.  That 
pretty much rules out SPI for this then.  Do you have a suggestion for an 
alternative? I guess it would be go to the planner/executor directly with the 
node tree?

>  In general, users may expect that
> once those are parsed by the accepting DDL command, they'll hold still,
> not get re-interpreted at runtime.

I agree entirely. I wasn't aware of the deparsing/reparsing problem.

> ...
> (Relevant to that, I wonder why this patch is only concerned with
> WHEN clauses and not all the other places where we forbid subqueries
> for implementation simplicity.)

I don't know how many other places WHEN clauses are used. Rules, perhaps? The 
short answer is this patch was written to solve a specific problem I had rather 
than it being a more general attempt to remove all "subquery forbidden" 
restrictions.

> 
> > b. If a WHEN expression is defined as "n = (SELECT ...)", there is the 
> > possibility that a user gets the error "more than one row returned by a 
> > subquery used as an expression" when performing DML, which would be rather 
> > cryptic if they didn't know there was a trigger involved.  To avoid this, 
> > we could disallow scalar expressions, with a hint to use the ANY/ALL 
> > quantifiers.
> 
> How is that any more cryptic than any other error that can occur
> in a WHEN expression?

It isn't.  What *is* different about it, is that -- AFAIK -- it is the only 
cryptic message that can come about due to the syntactic structure of an 
expression.  Yes, someone could have a function that does "RAISE ERROR 'foo'".  
There's not a lot that can be done about that.  But scalar subqueries are 
detectable and they have an obvious rewrite using the quantifiers, hence the 
suggestion. However, it was just that; a suggestion.

-Joe

Reply via email to