[HACKERS] Read-only plan trees

Tom Lane Sun, 01 Dec 2002 16:22:09 -0800

Han Holl's recent complaint about memory leaks in SQL-language functions
has started me thinking again about making plan trees read-only to the
executor.  This would make it a lot easier to manage memory cleanly in
the SQL function executor, and would eliminate a lot of plan tree
copying that currently goes on in plpgsql, prepared queries, etc.


Basically, instead of having plan tree nodes point to associated
executor state nodes, we should turn that around: executor state should
point to plan nodes.  Executor startup should build a state-node tree
that exactly parallels the plan tree, and *all* data that is changed by
the executor should live in that tree.  We can build this tree in a
memory context that is made to have query lifetime.  At executor
shutdown, rather than individually pfree'ing lots of stuff (and having
memory leaks wherever we forget), we can just delete the query memory
context.
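
A minimal sketch of the "delete the whole context" idea follows. It is a
toy stand-in, not PostgreSQL's memory-context code: QueryContext,
qctx_alloc, and qctx_delete are hypothetical names invented for this
illustration. Every allocation is chained onto the context, so shutdown
is a single walk over the chain rather than a series of individual
frees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation so the context can find it. */
typedef struct Chunk {
    struct Chunk *next;
} Chunk;

/* Hypothetical stand-in for a query-lifetime memory context. */
typedef struct QueryContext {
    Chunk *chunks;   /* all allocations made so far */
    size_t nchunks;  /* number of live allocations */
} QueryContext;

/* Allocate size bytes that stay valid until qctx_delete() is called. */
void *qctx_alloc(QueryContext *cxt, size_t size)
{
    Chunk *c = malloc(sizeof(Chunk) + size);

    if (c == NULL)
        abort();
    c->next = cxt->chunks;
    cxt->chunks = c;
    cxt->nchunks++;
    return c + 1;    /* payload starts right after the header */
}

/* Release everything in one pass; returns how many allocations were freed. */
size_t qctx_delete(QueryContext *cxt)
{
    size_t freed = 0;
    Chunk *c = cxt->chunks;

    while (c != NULL) {
        Chunk *next = c->next;

        free(c);
        freed++;
        c = next;
    }
    cxt->chunks = NULL;
    cxt->nchunks = 0;
    return freed;
}
```

In this sketch the caller never frees an individual allocation, so there
is nothing to forget: whatever was allocated in the context goes away
when the context does.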

This is a nontrivial task, and so I plan to tackle it in several stages.

Step 1: restructure plannodes.h and execnodes.h so that there is an
executor state tree with entries for each "plan node".  This tree will
be built recursively during ExecInitNode() --- you pass it a plan tree,
and it returns a state tree that links to the plan tree nodes.
ExecutorRun then needs only a pointer to the state tree.
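
Step 1 could look roughly like the sketch below. It is a simplified toy,
not the real node definitions: the Plan and PlanState layouts, the
lefttree/righttree and tuples_returned fields, and ExecEndNode's
signature are assumptions made for illustration. The point is that the
plan is only ever reached through const pointers, and all mutable data
lives in the parallel state tree.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { PLAN_SEQSCAN, PLAN_SORT, PLAN_JOIN } PlanKind;

/* Plan node: never written to once the planner has finished. */
typedef struct Plan {
    PlanKind kind;
    const struct Plan *lefttree;
    const struct Plan *righttree;
} Plan;

/* Executor state node: one per plan node, pointing back at it. */
typedef struct PlanState {
    const Plan *plan;             /* read-only link to the plan node */
    struct PlanState *lefttree;   /* state for plan->lefttree */
    struct PlanState *righttree;  /* state for plan->righttree */
    long tuples_returned;         /* example of per-execution data */
} PlanState;

/* Recursively build a state tree that parallels the plan tree. */
PlanState *ExecInitNode(const Plan *node)
{
    PlanState *state;

    if (node == NULL)
        return NULL;
    state = calloc(1, sizeof(PlanState));
    if (state == NULL)
        abort();
    state->plan = node;
    state->lefttree = ExecInitNode(node->lefttree);
    state->righttree = ExecInitNode(node->righttree);
    return state;
}

/* Tear down a state tree; the plan tree is left untouched. */
void ExecEndNode(PlanState *state)
{
    if (state == NULL)
        return;
    ExecEndNode(state->lefttree);
    ExecEndNode(state->righttree);
    free(state);
}
```

Because nothing is written into the plan, two state trees can be built
from the same plan at once and run independently, which is what would
remove the need for plan copying.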

Step 2: similarly restructure trees for expressions (quals and
targetlists).  Currently we do not explicitly build a state tree for
expressions --- the objects that ought to be in this tree are the
"fcache" entries that are attached to OP_EXPR and FUNC_EXPR nodes in
an expression plan tree.  The fcache objects really need to be in the
executor's context, however, and the cleanest way to make that happen
seems to be to build a state tree paralleling the expression plan tree.

But this is slightly inefficient, since there would be many nodes in the
expression state trees that aren't doing anything very useful, i.e., all
the ones that correspond to nodes other than OP and FUNC in the plan
tree.
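
To make the trade-off concrete, here is a small sketch of a parallel
expression state tree. All of the types and function names (Expr,
ExprState, FCache, init_expr_state, count_expr_states) are hypothetical
and much simpler than the real node structures. Every expression node
gets a state node, but only OP and FUNC state nodes carry an fcache.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { EXPR_CONST, EXPR_VAR, EXPR_OP, EXPR_FUNC } ExprKind;

/* Read-only expression node; unused argument slots are NULL. */
typedef struct Expr {
    ExprKind kind;
    const struct Expr *args[2];
} Expr;

/* Stand-in for the per-execution function cache. */
typedef struct FCache {
    long ncalls;
} FCache;

/* State node paralleling one expression node. */
typedef struct ExprState {
    const Expr *expr;
    FCache *fcache;              /* non-NULL only for OP and FUNC */
    struct ExprState *args[2];
} ExprState;

/* Build a state tree with one state node per expression node. */
ExprState *init_expr_state(const Expr *expr)
{
    ExprState *state;
    int i;

    if (expr == NULL)
        return NULL;
    state = calloc(1, sizeof(ExprState));
    if (state == NULL)
        abort();
    state->expr = expr;
    if (expr->kind == EXPR_OP || expr->kind == EXPR_FUNC) {
        state->fcache = calloc(1, sizeof(FCache));
        if (state->fcache == NULL)
            abort();
    }
    for (i = 0; i < 2; i++)
        state->args[i] = init_expr_state(expr->args[i]);
    return state;
}

/* Count state nodes and how many of them actually hold an fcache. */
void count_expr_states(const ExprState *state, int *nstates, int *nfcaches)
{
    int i;

    if (state == NULL)
        return;
    (*nstates)++;
    if (state->fcache != NULL)
        (*nfcaches)++;
    for (i = 0; i < 2; i++)
        count_expr_states(state->args[i], nstates, nfcaches);
}

/* Free a state tree; the expression tree itself is not touched. */
void free_expr_state(ExprState *state)
{
    int i;

    if (state == NULL)
        return;
    for (i = 0; i < 2; i++)
        free_expr_state(state->args[i]);
    free(state->fcache);
    free(state);
}
```

For an expression such as x + 1, this builds three state nodes of which
only one (the OP node's) holds an fcache; the other two are the overhead
described above.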

An alternative approach would be to make it work somewhat like Params
do now: in each OP and FUNC node, put an integer index field to replace
the current fcache pointer.  The planner would be responsible for
assigning sequential index values to every OP and FUNC in a plan, and
storing the total number of 'em in the plan's top node.  Then at
runtime, the executor would allocate an array of that many fcache
structs which it'd store in the EState for the plan.  Execution of
an individual op or func would index into the EState to find the fcache.
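
A rough sketch of the Param-style alternative, under the same caveats:
the struct layouts and the names assign_fcache_indexes, create_estate,
lookup_fcache, and free_estate are hypothetical, and the EState here
holds nothing but the fcache array. Note that the numbering pass takes a
non-const pointer, since it has to write the index into each node.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { EXPR_CONST, EXPR_VAR, EXPR_OP, EXPR_FUNC } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int fcache_index;        /* slot in the EState array, or -1 if none */
    struct Expr *args[2];    /* unused slots are NULL */
} Expr;

typedef struct FCache {
    long ncalls;
} FCache;

/* Per-execution state: one fcache slot per OP/FUNC node in the plan. */
typedef struct EState {
    int nfcaches;
    FCache *fcaches;
} EState;

/*
 * Planner side: number OP and FUNC nodes in pre-order, starting at next.
 * Returns the next unused index, i.e. the total when started from 0.
 */
int assign_fcache_indexes(Expr *expr, int next)
{
    int i;

    if (expr == NULL)
        return next;
    if (expr->kind == EXPR_OP || expr->kind == EXPR_FUNC)
        expr->fcache_index = next++;
    else
        expr->fcache_index = -1;
    for (i = 0; i < 2; i++)
        next = assign_fcache_indexes(expr->args[i], next);
    return next;
}

/* Executor side: allocate a zeroed array with one slot per index. */
EState *create_estate(int nfcaches)
{
    EState *estate = malloc(sizeof(EState));

    if (estate == NULL)
        abort();
    estate->nfcaches = nfcaches;
    /* Ask for at least one slot so a NULL result always means failure. */
    estate->fcaches = calloc(nfcaches > 0 ? (size_t) nfcaches : 1,
                             sizeof(FCache));
    if (estate->fcaches == NULL)
        abort();
    return estate;
}

/* Find the fcache for an OP or FUNC node by indexing into the EState. */
FCache *lookup_fcache(EState *estate, const Expr *expr)
{
    assert(expr->fcache_index >= 0 && expr->fcache_index < estate->nfcaches);
    return &estate->fcaches[expr->fcache_index];
}

void free_estate(EState *estate)
{
    free(estate->fcaches);
    free(estate);
}
```

Once the indexes are assigned, execution only reads the expression tree,
and separate EStates built for the same tree keep separate fcache
contents.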

Either of these approaches would mean that we couldn't easily "just
execute" a scalar expression tree, which is something that we do in
quite a few places (constraint checking for instance).  There would need
to be some advance setup done.  With the Param-style approach, the
advance setup would not be read-only on the expression plan tree ...
which seems like a bad idea, so I'm leaning towards building the more
expensive data structure.

Step 3: only after all the above spadework is done could we actually set
up a query-lifetime memory context and build the executor's state in it.

Comments?

                        regards, tom lane

