On Mon, Jan 29, 2007 at 01:38:02PM +0000, Gregory Stark wrote: > Instead I suggest we create one type of reentrant node, which memoizes its > output. It would be a kind of on-demand Materialize. It would be paired with a > second node type which would be the only type of node which can invoke it. > This RecursiveCall node would invoke the Memoize node using a special > interface in which it passes enough state for the Memoize node to seek to the > correct place in its output.
That I beleive is the right approach. I think an equivalent way of looking at it is a "Loop" node with an InitPlan and a StepPlan. Initially, the Loop node executes the InitPlan to get it's initial set of tuples, storing them like a Materialize node does. After that it keeps calling the StepNode, where somewhere inside the it has a node that extracts a row from the aforementioned tuplestore. It stores these returned tuples in the tuplestore also, thus giving you recursion. <snip> > (I've convinced myself that that's true but I should probably work out > a good proof of it before I make all this depend on it.) Yeah, proving it is going to be tricky, I'm not sure what the standard says about infinite recursion. > There are three general cases of the Memoize node. Breadth-first, Depth-first, > and non-linearly-recursive. I think the the only difference between depth and bredth-first searches is (if you consider the tuplesort to be a queue) whether the new tuples go to the front or the back of the list. But a data structures and algorithms book will know this. > There are a few open issues to consider. Namely, how to cost a RecursiveCall > node. One note: if you know that if you get p tuples out for every tuple in (where p<1) then the asymptotic result of 1 + p + p*p+ ... is 1/(1-p) However, I don't know it matters. You only need to cost the plan if there are alternate paths and given the plan structure is strongly constrained, I'm not sure how much it matters. > Also, if a subplan has exactly one call-site we really ought to inline it as > it will get much more reliable plans. Similarly if there are two call sites we > should consider inlining it if the cost of materializing the results (and > reading them back) is more than n-call-sites x the cost of executing the > query. I would expect That would happen with plain sequential scans for > example. In the case where the subplan has side-effects, you can't optimise at all. In the case of read-committed mode, will two seq-scans always return the same result? Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate.
signature.asc
Description: Digital signature