Hi Tom, On Wed, 22 Sep 2021, at 17:09, Tom Lane wrote: > > The main change is a switch to using SPI for expression evaluation. The > > plans are also cached along the same lines as the RI trigger plans. > > I really dislike this implementation technique. Aside from the likely > performance hit for existing triggers, I think it opens serious security > holes, because we can't fully guarantee that deparse-and-reparse doesn't > change the semantics. For comparison, the RI trigger code goes to > ridiculous lengths to force exact parsing of the queries it makes, > and succeeds only because it needs just a very stylized subset of SQL. > With a generic user-written expression, we'd be at risk for several > inherently-ambiguous SQL constructs such as IS DISTINCT FROM (see > relevant reading at [1]).
Where do you consider the performance hit to be? I just read the code again. The only time the new code paths are hit are when a FOR EACH STATEMENT trigger fires that has a WHEN condition. Given the existing restrictions for such a scenario, that can only mean a WHEN condition that is a simple function call; so, "SELECT foo()" vs "foo()"? Or have I misunderstood? Regarding the deparse-and-reparse --- if I understand correctly, the core problem is that we have no way of going from a node tree to a string, such that the string is guaranteed to have the same meaning as the node tree? (I did try just now to produce such a scenario with the patch but I couldn't get ruleutils to emit the wrong thing). Moreover, we couldn't store the string for use with SPI, as the string would be subject to trigger-time search path lookups. That pretty much rules out SPI for this then. Do you have a suggestion for an alternative? I guess it would be go to the planner/executor directly with the node tree? > In general, users may expect that > once those are parsed by the accepting DDL command, they'll hold still, > not get re-interpreted at runtime. I agree entirely. I wasn't aware of the deparsing/reparsing problem. > ... > (Relevant to that, I wonder why this patch is only concerned with > WHEN clauses and not all the other places where we forbid subqueries > for implementation simplicity.) I don't know how many other places WHEN clauses are used. Rules, perhaps? The short answer is this patch was written to solve a specific problem I had rather than it being a more general attempt to remove all "subquery forbidden" restrictions. > > > b. If a WHEN expression is defined as "n = (SELECT ...)", there is the > > possibility that a user gets the error "more than one row returned by a > > subquery used as an expression" when performing DML, which would be rather > > cryptic if they didn't know there was a trigger involved. To avoid this, > > we could disallow scalar expressions, with a hint to use the ANY/ALL > > quantifiers. > > How is that any more cryptic than any other error that can occur > in a WHEN expression? It isn't. What *is* different about it, is that -- AFAIK -- it is the only cryptic message that can come about due to the syntactic structure of an expression. Yes, someone could have a function that does "RAISE ERROR 'foo'". There's not a lot that can be done about that. But scalar subqueries are detectable and they have an obvious rewrite using the quantifiers, hence the suggestion. However, it was just that; a suggestion. -Joe