On 3 March 2018 at 04:47, Robert Haas <robertmh...@gmail.com> wrote: > On Fri, Mar 2, 2018 at 6:21 AM, Amit Langote > <langote_amit...@lab.ntt.co.jp> wrote: >> 2. Removing redundant clauses from each list, that is, >> remove_redundant_clauses() to produce lists with just one member per >> operator strategy for each partition key, > > I don't see that there's a real need to perform step 2 at all. I > mean, if you have x > $1 and x > $2 in the query, you can just compute > the set of partitions for the first clause, compute the set of > partitions for the second clause, and then intersect. That doesn't > seem obviously worse than deciding which of $1 and $2 is greater and > then pruning only based on whichever one is greater in this case.
This is an interesting idea. It may simplify the code quite a bit as the clause reduction code is quite bulky. However, unless I'm mistaken, for this to work, certain inputs will need significantly more processing to determine the minimum set of matching partitions. Let's look at the following perhaps unlikely case. (I picked an extreme case to demonstrate why this may be an inferior method) Given the table abc (...) partition by range (a,b,c), with the query: select * from abc where a >= 1 and a >= 2 and a >= 3 and b >= 1 and b >= 2 and b = 3 and c >= 1 and c >= 2 and c = 3; We would likely still be parsing those clauses into some struct like PartitionClauseInfo and would end up with some arrays or Lists with the clauses segmented by partition key. It appears to me, for your method to work we'd need to try every combination of the clauses matching each partition key, which in this case is 3 * 3 * 3 searches. Amit's current method is 1 search, after the clause reduction which is 3 + 3 + 3 (O(N) per partition key) I've tried to think of a more genuine poor performing case for this with IN or NOT IN lists, but I can't quite see it since NOT IN will only be supported by LIST partitioning, which can only have a single partition key and IN would be OR conditions, each of which would be evaluated in a different round of looping. Although I'm not ruling out that my imagination is just not good enough. With that considered, is it still a good idea to do it this way? Or maybe I've misunderstood the idea completely? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services