Re: [HACKERS] Some notes about Param handling with "Oracle style" plpgsql variables

Pavel Stehule Mon, 02 Nov 2009 11:05:44 -0800

2009/11/2 Tom Lane <t...@sss.pgh.pa.us>:
> One of the interesting properties of Oracle-compatible variable
> references in plpgsql is that the set of variables referenced by a
> given query could change during a forced replan.  For example,
> consider
>
>        declare x int;
>                r record;
>        ...
>        for r in select x,y from tab loop ...
>
> If tab contains a column "x" then the "x" reference in the SELECT
> refers to tab.x; if not, it refers to the plpgsql variable x.
> So when first executing the SELECT we might find that it requires
> a Param reference to the plpgsql variable, and then after a replan
> is forced by ALTER TABLE tab ADD COLUMN x, there is no need for
> the Param anymore.  Or vice versa.
>
> This kinda calls into question whether the Oracle way is actually
> a good idea or not; but my purpose here is not to debate that,
> just to look at what it takes to implement it.


This is reason, why I would to see third mode (incompatible with
Oracle or pg), that raise error, when it detects any intersecting
identifiers. I understand so this mode should not be default, but
personally I'll use it everywhere.

Pavel

>
> Currently, plpgsql generates a list of the variables referenced by
> any SQL statement or expression immediately upon seeing the text,
> before it's ever even fed to the core parser.  I had been envisioning
> having the parser callback hook construct the list on-the-fly during
> parsing, but the possibility that the list will change from time to
> time means that other changes are needed too.  Notably:
>
> 1. plancache.c does not have any provision for letting the Param type
> array associated with a stored statement change when the statement is
> replanned due to SI invalidation.
>
> 2. The control flow for a replan is that plpgsql calls SPI_execute_plan,
> which calls RevalidateCachedPlan, which does the replan if the cached
> plan is discovered to be stale.  However, plpgsql already had to set up
> the list of actual parameter values before it called SPI_execute_plan,
> which means it is *way* too late to change the list of required Params
> even if plancache let us do it.
>
> After chewing on these facts for awhile, I am thinking that the best
> solution is for plpgsql to abandon the notion of a predetermined list
> of parameters for a SQL query altogether.  What that list basically
> provides is a mapping from Param numbers ($n) to plpgsql "datum numbers"
> (indexes in the list of a plpgsql function's variables).  We could make
> that mapping always be one-to-one, since there's no real reason that the
> Params available to a query have to be consecutively numbered.  So the
> transformColumnRef hook would just pass back a Param using the
> referenced variable's datum number as paramid; it wouldn't bother at all
> with building a data structure listing the specific variables actually
> used in the query.
>
> As far as plancache goes, it would therefore always see a null array
> of Param type OIDs associated with a plpgsql-generated query, and we'd
> not have to provide a way to update that.  (We'd still keep the ability
> to store such an array, because most other callers of plancache will
> still want a fixed list of Params.)  What we'd have to add to plancache
> instead is the ability to install caller-determined parser callback
> hooks when it is calling the parser for a replan.  This seems fairly
> easy to do --- I'm envisioning a sort of meta-hook function that gets
> called with the new ParseState and can insert hook function pointers
> in it.
>
> The other issue with this is what to do at runtime.  We could do it
> with no other changes if we had plpgsql always set up Values/Nulls
> arrays listing *every* datum's current value.  This seems a bit
> brute-force though --- it could be slow in a function with a lot of
> variables, and in most cases any specific query or expression would
> not need most of those values.  What I think we should do instead
> is extend the ParamListInfo structure to add a callback hook function
> that populates individual ParamExternData array entries on-demand.
> The core executor would call the hook when it tried to fetch the
> value of a Param that was currently invalid (ptype == 0).  So the
> hook would be invoked only once per query per referenced parameter,
> which shouldn't be much overhead.  Another interesting property
> of this approach is that it'd fix the longstanding user complaint
> that constructions like
>        if (TG_OP = 'INSERT' and NEW.foo = 'bar') ...
> fail prematurely.  The executor would never demand the value
> of NEW.foo, and thus not fail, if TG_OP isn't INSERT.
>
> Comments?
>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Some notes about Param handling with "Oracle style" plpgsql variables

Reply via email to