On Tue, Jun 5, 2018 at 6:24 PM, Dmitry Dolgov <9erthali...@gmail.com> wrote: >> On 5 June 2018 at 12:31, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote: >> >> doesn't look quite right. What says expr is really a Param? The patch >> appears to work because, by setting pinfo->execparams to *something*, it >> triggers execution-time pruning to run; its contents aren't necessarily >> used during execution pruning. In fact, it would've crashed if the >> execution-time pruning code had required execparams to contain *valid* >> param id, but currently it doesn't. >> >> What I think we'd need to do to make this work is to make execution-time >> pruning be invoked even if there aren't any Params involved. IOW, let's >> try to teach make_partition_pruneinfo that it can go ahead also in the >> cases where there are expressions being compared with the partition key >> that contain (only) stable functions. Then, go and fix the >> execution-pruning code to not *always* expect there to be Params to prune >> with. > > Yeah, I agree - I copied this approach mindlessly from the original hacky > patch. So, looks like it's necessary to have something like got_stable_expr > together with gotparam.
I think the current code is heavily relying on Params to be present for partition pruning, which isn't true. Runtime partition pruning is possible when there are comparison conditions with partition key expressions on one side and "execution time constant" expressions on the other side. By "execution time constant" expression, I mean any expression that evaluates to a constant at the time of execution like a stable expressions (not just functions) or a Param expression. I can think of only these two at this time, but there can be more. So, gotparam should be renamed as "gotprunable_cond" to be generic. pull_partkey_params() should be renamed as "pull_partkey_conds" or something generic. That function would return true if there exists an expression in steps which can be evaluated to a constant at runtime, otherwise it returns false. My guess is there will be false-positives which need to be dealt with later, but there will be no false-negatives. > And after that the only place where I see Params > are in use is partkey_datum_from_expr where all the stuff is actually > evaluated. So apparently this part about "fix the execution-pruning code to > not > *always* expect there to be Params to prune with" will be only about this > function - am I correct or there is something else that I missed? Yes. But I think trying to evaluate parameters in this function is not good. The approach of folding constant expressions before or immediately after the execution starts doesn't require the expressions to be evaluated in partkey_datum_from_expr and might benefit other places where stable expressions or params can appear. Other problem with partkey_datum_from_expr() seems to be that it evaluated only param nodes but not the expressions involving parameters which can folded into constants at runtime. Take for example following queries on table t1 with two partitions (0, 100) and (100, 200), populated using "insert into t1 select i, i from generate_series(0, 199) i;". There's an index on t1(a). explain analyze select * from t1 x left join t1 y on x.a = y.b where y.a = 5; QUERY PLAN ------------------------------------------------------------------------------------------------------------ Nested Loop (cost=0.00..6.78 rows=1 width=16) (actual time=0.033..0.066 rows=1 loops=1) -> Append (cost=0.00..2.25 rows=1 width=8) (actual time=0.019..0.035 rows=1 loops=1) -> Seq Scan on t1p1 y (cost=0.00..2.25 rows=1 width=8) (actual time=0.018..0.035 rows=1 loops=1) Filter: (a = 5) Rows Removed by Filter: 99 -> Append (cost=0.00..4.51 rows=2 width=8) (actual time=0.011..0.027 rows=1 loops=1) -> Seq Scan on t1p1 x (cost=0.00..2.25 rows=1 width=8) (actual time=0.006..0.022 rows=1 loops=1) Filter: (y.b = a) Rows Removed by Filter: 99 -> Seq Scan on t1p2 x_1 (cost=0.00..2.25 rows=1 width=8) (never executed) Filter: (y.b = a) Planning Time: 0.644 ms Execution Time: 0.115 ms (13 rows) t1p2 x_1 is never scanned indicating that run time partition pruning happened. But then see the following query postgres:17889=#explain analyze select * from t1 x left join t1 y on x.a = y.b + 100 where y.a = 5; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..7.28 rows=1 width=16) (actual time=0.055..0.093 rows=1 loops=1) -> Append (cost=0.00..2.25 rows=1 width=8) (actual time=0.017..0.034 rows=1 loops=1) -> Seq Scan on t1p1 y (cost=0.00..2.25 rows=1 width=8) (actual time=0.016..0.033 rows=1 loops=1) Filter: (a = 5) Rows Removed by Filter: 99 -> Append (cost=0.00..5.01 rows=2 width=8) (actual time=0.034..0.054 rows=1 loops=1) -> Seq Scan on t1p1 x (cost=0.00..2.50 rows=1 width=8) (actual time=0.026..0.026 rows=0 loops=1) Filter: ((y.b + 100) = a) Rows Removed by Filter: 100 -> Seq Scan on t1p2 x_1 (cost=0.00..2.50 rows=1 width=8) (actual time=0.007..0.027 rows=1 loops=1) Filter: ((y.b + 100) = a) Rows Removed by Filter: 99 Planning Time: 0.424 ms Execution Time: 0.139 ms (14 rows) The scan on t1p1 x returns no rows and should have been pruned since y.b + 100 is constant for a given y.b. But for this to work, folding constant expressions doesn't help since y.b changes with every rescan of t1 x. So may be we need some way to constant fold expression during ExecutorRewind() as well. This is digression from the original report, but it's still within the scope of "why partition pruning doesn't work?" -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company