On Sat, Feb 7, 2015 at 2:30 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Fri, Feb 6, 2015 at 2:13 PM, Robert Haas <robertmh...@gmail.com> wrote:
> > The complicated part here seems to me to figure out what we need to
> > pass from the parallel leader to the parallel worker to create enough
> > state for quals and projection. If we want to be able to call
> > ExecScan() without modification, which seems like a good goal, we're
> > going to need a ScanState node, which is going to need to contain
> > valid pointers to (at least) a ProjectionInfo, an ExprContext, and a
> > List of quals. That in turn is going to require an ExecutorState.
> > Serializing those things directly doesn't seem very practical; what we
> > instead want to do is figure out what we can pass that will allow easy
> > reconstruction of those data structures. Right now, you're passing
> > the target list, the qual list, the range table, and the params, but
> > the range table doesn't seem to be getting used anywhere. I wonder if
> > we need it. If we could get away with just passing the target list
> > and qual list, and params, we'd be doing pretty well, I think. But
> > I'm not sure exactly what that looks like.
>
> IndexBuildHeapRangeScan shows how to do qual evaluation with
> relatively little setup:
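For reference, the pattern in IndexBuildHeapRangeScan that you point at
is roughly the following (a condensed sketch of the relevant parts of
src/backend/catalog/index.c, not the exact code):

EState	   *estate;
ExprContext *econtext;
TupleTableSlot *slot;
List	   *predicate;

estate = CreateExecutorState();
econtext = GetPerTupleExprContext(estate);

/* make the qual evaluation look at the slot we fill below */
slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
econtext->ecxt_scantuple = slot;

/* build execution state for the quals (here, the partial-index predicate) */
predicate = (List *) ExecPrepareExpr((Expr *) indexInfo->ii_Predicate,
									 estate);

/* ... then, for each heap tuple fetched ... */
ExecStoreTuple(heapTuple, slot, InvalidBuffer, false);
ResetExprContext(econtext);
if (predicate != NIL && !ExecQual(predicate, econtext, false))
	continue;			/* tuple does not satisfy the quals */

/* ... and on completion ... */
ExecDropSingleTupleTableSlot(slot);
FreeExecutorState(estate);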
I think even to make quals work, we need to do a few extra things, like
setting up the paramlist and rangetable. Also for quals, we need to fill
in the operator function OIDs by calling fix_opfuncids() and do the work
that the ExecInit*() functions do for quals. I think these extra things
will be required for qualification processing in a seq scan. Then we
need to construct the ProjectionInfo from the target list (again,
essentially what the ExecInit*() functions do); after constructing the
ProjectionInfo, we can call ExecProject(). Here we need to take care
that the instrumentation-collection functions, like InstrStartNode(),
InstrStopNode(), InstrCountFiltered1(), etc., are called at the
appropriate places, so that we can collect that information for an
Explain statement when the master backend requests it. Then finally,
after sending the tuples, we need to destroy all the execution state
constructed for fetching them.

So to make this work, we basically need to do all the important work
that the executor does, in three phases: initialization of the node,
execution of the node, and ending the node. Ideally, we could make this
work with code specific to just the execution of a sequential scan;
however, it seems to me we would then need more such code (extracted
from the core part of the executor) to parallelize other functionality,
like aggregation, partition seq scan, etc.

Another idea is to use the Executor-level interfaces (ExecutorStart(),
ExecutorRun(), ExecutorEnd()) for execution rather than the Portal-level
interfaces. I have used the Portal-level interfaces with the thought
that we could reuse the existing Portal infrastructure for parallel
execution of scrollable cursors, but as per my analysis it is not so
easy to support them, especially backward scan, absolute/relative fetch,
etc., so the Executor-level interfaces seem more appealing to me
(something like how the Explain statement works; see ExplainOnePlan()).
Using the Executor-level interfaces has the advantage that we can reuse
them for other parallel functionality. In this approach, we need to
take care of constructing the relevant structures (from the information
passed by the master backend) required by the Executor interfaces, but I
think that should be less work than in the previous approach (extracting
the seq-scan-specific parts of the executor). A rough sketch of the
worker-side flow I have in mind is at the end of this mail.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
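For illustration only, here is a minimal sketch of the Executor-level
approach in the worker, modelled on what ExplainOnePlan() does. It
assumes the worker has already reconstructed a PlannedStmt and
ParamListInfo from the information sent by the master backend (that
reconstruction is the hard part and is not shown), and
ParallelWorkerExecute is just a made-up name:

/*
 * Hypothetical worker-side entry point; the name and the surrounding
 * infrastructure are invented for illustration.
 */
static void
ParallelWorkerExecute(PlannedStmt *plannedstmt, ParamListInfo params,
					  DestReceiver *dest)
{
	QueryDesc  *queryDesc;

	/*
	 * Build a QueryDesc much as ExplainOnePlan() does.  If the master
	 * backend asked for instrumentation (Explain Analyze), we would pass
	 * the appropriate INSTRUMENT_* options instead of INSTRUMENT_NONE.
	 * This assumes a snapshot has already been pushed as active.
	 */
	queryDesc = CreateQueryDesc(plannedstmt,
								"<query text from master>",
								GetActiveSnapshot(), InvalidSnapshot,
								dest, params, INSTRUMENT_NONE);

	/* the three phases: initialization, execution, ending the node */
	ExecutorStart(queryDesc, 0);
	ExecutorRun(queryDesc, ForwardScanDirection, 0L);	/* 0 = fetch all */
	ExecutorFinish(queryDesc);
	ExecutorEnd(queryDesc);

	FreeQueryDesc(queryDesc);
}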