[HACKERS] Refactoring parser/analyze.c

Tom Lane Fri, 22 Jun 2007 16:18:44 -0700

In connection with bug #3403
http://archives.postgresql.org/pgsql-bugs/2007-06/msg00114.php
I've come to the conclusion that we really shouldn't do *any* processing
of utility commands at parse analysis time; they should be left as
raw-grammar output trees until execution.


The key reason for this is that any processing that depends on database
state might be obsolete by the time of execution, and we don't have any
infrastructure for taking locks on a utility command tree or otherwise
checking that it is still up to date.  The delay could be significant
for a command that is put into the plan cache (e.g., a statement in a
plpgsql function), so this isn't an academic concern.  I had already
foreseen this and delayed the processing of several utility commands
(e.g., CREATE INDEX, CREATE RULE) until runtime as part of the
plan-cache patch; but I left CREATE TABLE and ALTER TABLE alone,
mistakenly thinking that their parse analysis work consisted purely of
syntactic transformations and so could be done without reference to the
database state.  As noted in the discussion of bug #3403, this is wrong
with respect to the processing of SERIAL-column sequences.  There's
also the matter of CREATE TABLE ... LIKE, for which the CVS-HEAD code
says

 * Change the LIKE <subtable> portion of a CREATE TABLE statement into
 * column definitions which recreate the user defined column portions of
 * <subtable>.
 *
 * Note: because we do this at parse analysis time, any change in the
 * referenced table between parse analysis and execution won't be reflected
 * into the new table.  Is this OK?

So I'm thinking we should complete the break-up and delay the processing
done by transformCreateStmt and transformAlterTableStmt until execution
of the utility command begins.  In the case of ALTER TABLE we should
take out an exclusive lock on the target table before we even start to
do any of transformAlterTableStmt's work.
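
The ordering proposed for ALTER TABLE (lock first, then transform) might
be sketched as follows.  This is an illustration only: the Toy* types
and toy_* functions are hypothetical, the "lock" is a plain flag, and
lock release is left out.  The transform step refuses to run unless the
target is already locked, which is the invariant the proposal is after.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a table and its lock state. */
typedef struct ToyRelation {
    bool exclusively_locked;
    int  ncolumns;
} ToyRelation;

/* Hypothetical raw-grammar node for ALTER TABLE ... ADD COLUMN. */
typedef struct ToyAlterStmt {
    ToyRelation *target;
    int          columns_to_add;
} ToyAlterStmt;

static void toy_lock_exclusive(ToyRelation *rel)
{
    rel->exclusively_locked = true;
}

/* Stand-in for the state-dependent transformation work.  It inspects the
 * target, so it returns false if called before the lock is held. */
static bool toy_transform_alter(const ToyAlterStmt *stmt, int *final_ncolumns)
{
    if (!stmt->target->exclusively_locked)
        return false;
    *final_ncolumns = stmt->target->ncolumns + stmt->columns_to_add;
    return true;
}

/* Execution entry point: take the lock, then transform, then apply. */
static bool toy_execute_alter(ToyAlterStmt *stmt)
{
    int final_ncolumns;

    toy_lock_exclusive(stmt->target);
    if (!toy_transform_alter(stmt, &final_ncolumns))
        return false;
    stmt->target->ncolumns = final_ncolumns;
    return true;
}
```

Because the lock is taken before anything reads the target, nothing the
transformation observes can change underneath it in this model.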

I had originally thought that parser/analyze.c was too intertwined to
try to break up, but upon looking more closely I find that there is
actually almost complete separation between the handling of plannable
commands and utility commands.  I would like to refactor analyze.c
into two files to reflect this new understanding of when things happen:

analyze.c: keeps parse_analyze, transformStmt, and the handling of
SELECT/INSERT/UPDATE/DELETE commands, as well as EXPLAIN and DECLARE
CURSOR, which are special cases but more nearly related to plannable
commands than not.

a new file named something like parse_utilcmd.c: transformCreateStmt,
transformAlterTableStmt, transformCreateSchemaStmt, transformIndexStmt,
transformRuleStmt, and subsidiary routines.  These functions would now
be called at the beginning of execution of the respective utility
commands, and not from parse_analyze() at all.
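
The resulting division of labor can be summarized as a small
classification routine.  This is a sketch, not proposed code: the
ToyCommand and ToyPhase enums and toy_transform_phase() are hypothetical
names, and the mapping from transform functions to command types (for
example transformIndexStmt to CREATE INDEX) is an assumption made for
illustration.

```c
#include <assert.h>

/* Hypothetical command tags for the statements named above. */
typedef enum ToyCommand {
    TOY_SELECT,
    TOY_INSERT,
    TOY_UPDATE,
    TOY_DELETE,
    TOY_EXPLAIN,
    TOY_DECLARE_CURSOR,
    TOY_CREATE_TABLE,
    TOY_ALTER_TABLE,
    TOY_CREATE_SCHEMA,
    TOY_CREATE_INDEX,
    TOY_CREATE_RULE
} ToyCommand;

/* When the state-dependent transformation of a command would run. */
typedef enum ToyPhase {
    TOY_AT_PARSE_ANALYSIS,      /* handled in analyze.c */
    TOY_AT_EXECUTION            /* handled in the new utility-command file */
} ToyPhase;

static ToyPhase toy_transform_phase(ToyCommand cmd)
{
    switch (cmd)
    {
        case TOY_SELECT:
        case TOY_INSERT:
        case TOY_UPDATE:
        case TOY_DELETE:
        case TOY_EXPLAIN:
        case TOY_DECLARE_CURSOR:
            return TOY_AT_PARSE_ANALYSIS;
        case TOY_CREATE_TABLE:
        case TOY_ALTER_TABLE:
        case TOY_CREATE_SCHEMA:
        case TOY_CREATE_INDEX:
        case TOY_CREATE_RULE:
            return TOY_AT_EXECUTION;
    }
    /* Not reached for valid tags; defer by default. */
    return TOY_AT_EXECUTION;
}
```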

It looks like only release_pstate_resources() and makeFromExpr() are
used in common by these two files; both of them arguably belong
somewhere else anyway (parse_node.c and makefuncs.c respectively).
Also we might need to export transformStmt() from analyze.c; the
utility-command routines currently call that directly, and I'm undecided
whether they can or should go through parse_analyze() instead.

With this refactoring, there will not be any use of the
extras_before/extras_after mechanism within analyze.c, and I'm sorely
tempted to just rip it out, redeclaring parse_analyze() and friends
to return a single Query node instead of a List.  Can anyone foresee
a reason we might still need to return multiple Query nodes from a
single plannable statement?  (Note: "rule expansion" isn't a reason;
that happens later.)
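
To make the question concrete, here is a toy comparison of the two
return shapes.  It is a sketch under assumptions: the Toy* types and
toy_* functions are hypothetical, and it assumes the extras lists hold
helper statements that are spliced before and after the main query in
the result.  When both extras lists are always empty, as they would be
for plannable statements after the refactoring, the list form never
carries more than the single-node form.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Query node. */
typedef struct ToyQuery {
    const char *label;
} ToyQuery;

#define TOY_MAX_QUERIES 8

/* Hypothetical stand-in for a List of Query nodes. */
typedef struct ToyQueryList {
    const ToyQuery *items[TOY_MAX_QUERIES];
    size_t          length;
} ToyQueryList;

static void toy_append(ToyQueryList *list, const ToyQuery *q)
{
    assert(list->length < TOY_MAX_QUERIES);
    list->items[list->length++] = q;
}

/* List-returning shape: extras before, then the main query, then extras
 * after (assumed ordering, for illustration). */
static ToyQueryList toy_analyze_list(const ToyQueryList *extras_before,
                                     const ToyQuery *main_query,
                                     const ToyQueryList *extras_after)
{
    ToyQueryList result = { { NULL }, 0 };
    size_t       i;

    for (i = 0; i < extras_before->length; i++)
        toy_append(&result, extras_before->items[i]);
    toy_append(&result, main_query);
    for (i = 0; i < extras_after->length; i++)
        toy_append(&result, extras_after->items[i]);
    return result;
}

/* Single-node shape: nothing to splice, so just hand back the query. */
static const ToyQuery *toy_analyze_single(const ToyQuery *main_query)
{
    return main_query;
}
```

With empty extras, toy_analyze_list() yields a one-element list whose
only member is the same node toy_analyze_single() returns.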

Comments?

                        regards, tom lane
