On Wed, Feb 28, 2018 at 11:53 PM, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> Attached updated patches.
+       memcpy(part_scheme->partsupfunc, partkey->partsupfunc,
+              sizeof(FmgrInfo) * partnatts);

You can't copy an FmgrInfo by just applying memcpy() to it. Use fmgr_info_copy; a rough sketch of what I mean follows the two numbered points below.

I don't like the comments at the top of partprune.c very much. It seems strange to document individual functions here; those functions can (and should) be documented in their individual header comments. What we should have at the top of the file is a discussion of the overall theory of operation of this module, something that currently seems not to exist anywhere in the patch. I tried to figure that out by looking at the new data structures the patch introduces: PartitionPruneContext, PartScanKeyInfo, PartitionClauseInfo, and PartClause. It looks like the general idea is that code that wants to use these facilities does the following:

Step 1. Generate a PartitionPruneContext. In this patch, this seems to consist entirely of copying information from the RelOptInfo or its PartitionScheme.

Step 2. Call generate_partition_clauses() to extract relevant clauses from rel->baserestrictinfo and generate a PartitionClauseInfo.

Step 3. Call get_partitions_from_clauses() to generate a set of unpruned partition indexes. Internally, that function will first populate a PartScanKeyInfo from the PartitionClauseInfo by calling extract_bounding_datums(). Then it calls get_partitions_for_keys(), which generates the set of unpruned partition indexes from the PartitionPruneContext and PartScanKeyInfo.

I guess there are two main things about this that seem odd to me:

1. I don't see why the partition pruning code should ever be responsible for evaluating anything itself, as it does currently via evaluate_expr(). For plan-time partition pruning, we are already using eval_const_expressions() to perform as much Const-simplification as possible. If we see an OpExpr with a partitioning column on one side, then the other side is either a Const, in which case we can perform pruning, or it's something whose value can't be known until runtime, in which case we cannot. There should be no case in which eval_const_expressions() fails to simplify something to a constant yet we know the value at plan time; if such a case exists, then it's an opportunity to improve eval_const_expressions(), not a reason to be inconsistent with the rules it applies. For run-time pruning, it is probably semantically wrong and certainly undesirable from a performance perspective to spin up a new ExecutorState and a new ExprState every time we need to reassess the decision about which partitions need to be scanned. Instead, the expressions whose values are inputs to the pruning process should be computed in the ExecutorState/ExprState for the main query, and the results should be passed to the partition-pruning code as inputs to the decision-making process.

2. All processing of clauses should happen at plan time, not run time, so that we're not repeating work. If, for example, a prepared query is executed repeatedly, the clauses should get fully processed when it's planned, and then the only thing that should happen at execution time is that we take the values to which we now have access and use them to decide what to prune. With this design, we're bound to repeat at least the portion of the work done by get_partitions_from_clauses() at runtime, and as things stand, it looks like the current version of the run-time partition pruning patch repeats the generate_partition_clauses() work at runtime as well.
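Coming back to the memcpy() hunk quoted at the top, here is roughly what I have in mind. This is only a sketch; I'm assuming the copy runs in a loop over the partitioning columns and guessing that CurrentMemoryContext is the right place for the copied state to live:

    int     i;

    /*
     * Sketch only: copy each support-function FmgrInfo with
     * fmgr_info_copy() rather than a raw memcpy().  CurrentMemoryContext
     * is a guess at where the copied state should be allocated.
     */
    for (i = 0; i < partnatts; i++)
        fmgr_info_copy(&part_scheme->partsupfunc[i],
                       &partkey->partsupfunc[i],
                       CurrentMemoryContext);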
It seems to me as though what we ought to be doing is extracting clauses that are of the correct general form and producing a list of <partition-column-index>, <partition-operator-strategy>, <expression> triplets, sorted by increasing partition column index and operator strategy number, in a form that can be represented as a Node tree. This can be attached to the plan for use at runtime or, if there are at least some constants among the expressions, used at plan time for partition exclusion. So the main interfaces would be something like this:

extern List *extract_partition_pruning_clauses(RelOptInfo *rel);
extern PartitionPruneContext *CreatePartitionPruneContext(List *);
extern Bitmapset *PerformStaticPartitionPruning(PartitionPruneContext *);
extern Bitmapset *PerformDynamicPartitionPruning(PartitionPruneContext *,
                                                 Datum *values, bool *isnull);

In PerformStaticPartitionPruning(), we'd just ignore anything for which the expression was non-constant; to call PerformDynamicPartitionPruning(), the caller would need to evaluate the expressions from the PartitionPruneContext using the appropriate EState and then pass the results into this function. A rough usage sketch follows below.

I realize that I'm hand-waving a bit here, or maybe more than a bit, but in general I think it's right to imagine that the work here needs to be segregated into two very well-separated phases. In the first phase, we do all possible work to assess which clauses are relevant and put them in a form where the actual pruning work can be done as cheaply as possible once we know the values. This phase always happens at plan time. In the second phase, we get the values, either by extracting them from Const nodes at plan time or by evaluating expressions at runtime, and then decide which partitions to eliminate. This phase happens at plan time if we have enough constants available to do something useful and at runtime if we have enough non-constants to do something useful. Right now there doesn't seem to be a clean separation between these two phases, and I think that's not good.
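To make that a little more concrete, here is roughly how I'd expect a caller to use those interfaces. None of this exists yet; it is only a sketch of the proposal above, and the variable names (rel, values, isnull, and so on) are placeholders:

    /* Plan time: classify the clauses once and build the context. */
    List       *pruneclauses = extract_partition_pruning_clauses(rel);
    PartitionPruneContext *context = CreatePartitionPruneContext(pruneclauses);

    /* Plan time: prune using whatever Const expressions we already have. */
    Bitmapset  *livepartitions = PerformStaticPartitionPruning(context);

    /*
     * Run time: evaluate the remaining expressions in the main query's
     * EState/ExprState, then prune again with the resulting values.
     */
    livepartitions = PerformDynamicPartitionPruning(context, values, isnull);

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company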