On Fri, Mar 2, 2018 at 10:54 AM, David Rowley <david.row...@2ndquadrant.com> wrote: > I don't think building it is going to cost a huge amount. Presumably, > there are not many partitioned tables with 10 rows, so probably having > the get_partitions_from_clauses work as quickly as possible is better > than saving 100 nanoseconds in executor startup.
I agree that one could go overboard with trying to push work from executor time to planner time, but I don't think the current patch is very close to the point of diminishing returns. It's doing nearly everything at execution time. > That being said, there's still a small issue with the run-time pruning > patch which is caused by me not pre-processing the clauses during > planning. Ideally, I'd be able to pre-process at least enough to > determine if any Params match the partition key so that I know if > run-time pruning can be used or not. As of now, I'm not doing that as > it seems wasteful to pre-process during planning just to get the Param > Ids out, then not be able to carry the pre-processed ones over to the > executor. We also can't really reuse the pre-processed state that was > generated during the planner's calls to generate_partition_clauses() > since we'll additionally also be passing in the parameterized path > clauses as well as the baserestrictinfo clauses. I think it should be possible to have a structure where all the work of classifying clauses happens in the planner. By the time we get to execution time, we should be able to know for sure which clauses are relevant. For example, if the user says WHERE a = $1 + 3 AND b = (random() * 100)::int, and the partition key is (a, b), we should be able to figure out at plan time that the clause containing b is useless (because it's volatile) and the clause containing a is useful only if this is range-partitioning (because with hash-partitioning we must have an equality clause for every partition to do anything). I think it should also be possible to know which expressions need to be computed at runtime -- in this case, $1 + 3 -- and to which columns of the partition key they correspond -- in this case, the first. I just proposed a data representation which could track all that stuff and I'm sure there are other ways to do it, too. I think that things like PartClause that include both an opno and various bits of cached information, including FmgrInfo, are not a very good idea. A lot of work has been done to maintain the separation of immutable information -- like Plans or Exprs -- from the run-time state they use -- PlanState or ExprState. I think we would do well to follow that distinction here, too, even if it seems to introduce some "silly" overhead at execution time. I think it will pay for itself in future code maintenance and the ability to apply optimizations such as JIT which benefit from good divisions in this case. It is not crazy to imagine that the "pruning program" idea I floated in a previous email could be folded into the JIT stuff Andres is doing where something with a less-clean separation of concerns would run into problems. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company