RE: Problem with default partition pruning
Hi Thibaut,

Thanks a lot for your test and comments.

> Le 28/02/2019 à 09:26, Imai, Yoshikazu a écrit :
> > Hosoya-san
> >
> > On Wed, Feb 27, 2019 at 6:51 AM, Yuzuko Hosoya wrote:
> >>> From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp]
> >>> Sent: Wednesday, February 27, 2019 11:22 AM
> >>>
> >>> Hosoya-san,
> >>>
> >>> On 2019/02/22 17:14, Yuzuko Hosoya wrote:
> >>>> Hi,
> >>>>
> >>>> I found a bug in default partition pruning when executing a range query.
> >>>>
> >>>> postgres=# create table test1(id int, val text) partition by range (id);
> >>>> postgres=# create table test1_1 partition of test1 for values from (0) to (100);
> >>>> postgres=# create table test1_2 partition of test1 for values from (150) to (200);
> >>>> postgres=# create table test1_def partition of test1 default;
> >>>>
> >>>> postgres=# explain select * from test1 where id > 0 and id < 30;
> >>>>                          QUERY PLAN
> >>>> ------------------------------------------------------------
> >>>>  Append  (cost=0.00..11.83 rows=59 width=11)
> >>>>    ->  Seq Scan on test1_1  (cost=0.00..5.00 rows=58 width=11)
> >>>>          Filter: ((id > 0) AND (id < 30))
> >>>>    ->  Seq Scan on test1_def  (cost=0.00..6.53 rows=1 width=12)
> >>>>          Filter: ((id > 0) AND (id < 30))
> >>>> (5 rows)
> >>>>
> >>>> There is no need to scan the default partition, but it is scanned.
> >>>>
> >>>> In the current implementation, whether the default partition is
> >>>> scanned or not is determined according to each condition of the given
> >>>> WHERE clause at get_matching_range_bounds().  In this example,
> >>>> scan_default is set true according to id > 0 because id >= 200 matches
> >>>> the default partition.  Similarly, according to id < 30, scan_default
> >>>> is set true.  Then, these results are combined according to AND/OR at
> >>>> perform_pruning_combine_step().  In this case, the final result's
> >>>> scan_default is set true.
> >>>>
> >>>> The modifications I made are as follows:
> >>>> - get_matching_range_bounds() determines only the offsets of range
> >>>>   bounds according to each condition
> >>>> - These results are combined at perform_pruning_combine_step()
> >>>> - Whether the default partition is scanned or not is determined at
> >>>>   get_matching_partitions()
> >>>>
> >>>> Attached the patch.  Any feedback is greatly appreciated.
> >>>
> >>> Thank you for reporting.  Can you please add this to the March CF in
> >>> the Bugs category so as not to lose track of this?
> >>>
> >>> I will try to send review comments soon.
> >>>
> >> Thank you for your reply.  I added this to the March CF.
> >
> > I tested with a simple use case and I confirmed it works correctly like
> > below.
> >
> > In the case of using a BETWEEN clause:
> > postgres=# create table test1(id int, val text) partition by range (id);
> > postgres=# create table test1_1 partition of test1 for values from (0) to (100);
> > postgres=# create table test1_2 partition of test1 for values from (150) to (200);
> > postgres=# create table test1_def partition of test1 default;
> >
> > [HEAD]
> > postgres=# explain analyze select * from test1 where id between 0 and 50;
> >                                QUERY PLAN
> > -------------------------------------------------------------------------
> >  Append  (cost=0.00..58.16 rows=12 width=36) (actual time=0.008..0.008 rows=0 loops=1)
> >    ->  Seq Scan on test1_1  (cost=0.00..29.05 rows=6 width=36) (actual time=0.005..0.005 rows=0 loops=1)
> >          Filter: ((id >= 0) AND (id <= 50))
> >    ->  Seq Scan on test1_def  (cost=0.00..29.05 rows=6 width=36) (actual time=0.002..0.002 rows=0 loops=1)
> >          Filter: ((id >= 0) AND (id <= 50))
> >
> > [patched]
> > postgres=# explain analyze select * from test1 where id between 0 and 50;
> >                                QUERY PLAN
> > -------------------------------------------------------------------------
RE: Problem with default partition pruning
Hi Amit-san,

From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp]
Sent: Monday, March 18, 2019 6:44 PM
> Hosoya-san,
>
> On 2019/03/15 15:05, Yuzuko Hosoya wrote:
> > Indeed, it's problematic.  I also tested and found that this problem
> > occurred when no partition matched the WHERE clauses.  So the following
> > query didn't work correctly.
> >
> > # explain select * from test1_3 where (id > 0 and id < 30);
> >                          QUERY PLAN
> > ------------------------------------------------------------
> >  Append  (cost=0.00..58.16 rows=12 width=36)
> >    ->  Seq Scan on test1_3_1  (cost=0.00..29.05 rows=6 width=36)
> >          Filter: ((id > 0) AND (id < 30))
> >    ->  Seq Scan on test1_3_2  (cost=0.00..29.05 rows=6 width=36)
> >          Filter: ((id > 0) AND (id < 30))
> > (5 rows)
> >
> > I created a new patch to handle this problem, and confirmed the query
> > you mentioned works as expected.
> >
> > # explain select * from test1 where (id > 0 and id < 30) or (id > 220 and id < 230);
> >                          QUERY PLAN
> > ------------------------------------------------------------
> >  Append  (cost=0.00..70.93 rows=26 width=36)
> >    ->  Seq Scan on test1_1_1  (cost=0.00..35.40 rows=13 width=36)
> >          Filter: (((id > 0) AND (id < 30)) OR ((id > 220) AND (id < 230)))
> >    ->  Seq Scan on test1_3_1  (cost=0.00..35.40 rows=13 width=36)
> >          Filter: (((id > 0) AND (id < 30)) OR ((id > 220) AND (id < 230)))
> > (5 rows)
> >
> > v2 patch attached.  Could you please check it again?
>
> I think the updated patch breaks the promise that get_matching_range_bounds
> won't set scan_default based on individual pruning value comparisons.  How
> about the attached delta patch that applies on top of your earlier v1
> patch, which fixes the issue reported by Thibaut?
>
Indeed.  I agree with your proposal.  Also, I confirmed your patch works
correctly.

Best regards,
Yuzuko Hosoya
RE: Problem with default partition pruning
Hi,

Thanks a lot for the additional tests and the new patch.

> Le 20/03/2019 à 10:06, Amit Langote a écrit :
> > Hi Thibaut,
> >
> > On 2019/03/19 23:58, Thibaut Madelaine wrote:
> >> I kept on testing with sub-partitioning.
> >
> > Thanks.
> >
> >> I found a case, using 2 default partitions, where a default partition
> >> is not pruned:
> >>
> >> create table test2(id int, val text) partition by range (id);
> >> create table test2_20_plus_def partition of test2 default;
> >> create table test2_0_20 partition of test2 for values from (0) to (20)
> >>   partition by range (id);
> >> create table test2_0_10 partition of test2_0_20 for values from (0) to (10);
> >> create table test2_10_20_def partition of test2_0_20 default;
> >>
> >> # explain (costs off) select * from test2 where id=5 or id=25;
> >>             QUERY PLAN
> >> -----------------------------------
> >>  Append
> >>    ->  Seq Scan on test2_0_10
> >>          Filter: ((id = 5) OR (id = 25))
> >>    ->  Seq Scan on test2_10_20_def
> >>          Filter: ((id = 5) OR (id = 25))
> >>    ->  Seq Scan on test2_20_plus_def
> >>          Filter: ((id = 5) OR (id = 25))
> >> (7 rows)
> >>
> >> I have the same output using Amit's v1-delta.patch or Hosoya's
> >> v2_default_partition_pruning.patch.
> >
> > I think I've figured out what may be wrong.
> >
> > Partition pruning step generation code should ignore any arguments of
> > an OR clause that won't be true for a sub-partitioned partition, given
> > its partition constraint.
> >
> > In this case, id = 25 contradicts test2_0_20's partition constraint
> > (which is, a IS NOT NULL AND a >= 0 AND a < 20), so the OR clause
> > should really be simplified to id = 5, ignoring the id = 25 argument.
> > Note that we remove id = 25 only for the considerations of pruning and
> > not from the actual clause that's passed to the final plan, although
> > it wouldn't be a bad idea to try to do that.
> >
> > Attached revised delta patch, which includes the fix described above.
> >
> > Thanks,
> > Amit
>
> Amit, I tested many cases with nested range sub-partitions... and I did
> not find any problem with your last patch :-)
>
> I tried mixing with hash partitions with no problems.
>
> From the patch, there seem to be fewer checks than before.  I cannot
> think of a case that could have performance impacts.
>
> Hosoya-san, if you agree with Amit's proposal, do you think you can send
> a patch unifying your default_partition_pruning.patch and Amit's second
> v1-delta.patch?
>
I understood Amit's proposal.  But I think the issue Thibaut reported would
occur regardless of whether the clauses contain OR clauses or not, as
follows.  I tested a query which should output "One-Time Filter: false".

# explain select * from test2_0_20 where id = 25;
                              QUERY PLAN
-------------------------------------------------------------------
 Append  (cost=0.00..25.91 rows=6 width=36)
   ->  Seq Scan on test2_10_20_def  (cost=0.00..25.88 rows=6 width=36)
         Filter: (id = 25)

As Amit described in the previous email, id = 25 contradicts test2_0_20's
partition constraint, so I think this clause should be ignored, and we can
also handle this case in a similar way to Amit's proposal.

I attached v1-delta-2.patch which fixes the above issue.  What do you think
about it?

Best regards,
Yuzuko Hosoya

v1-delta-2.patch
Description: Binary data
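Amit's idea of ignoring OR arms that contradict a sub-partitioned table's partition constraint can be sketched with a toy model (plain Python with invented names, not PostgreSQL's predicate_refuted_by(); only `id = <const>` arms against a single range constraint are handled, which is all the example needs):

```python
# Simplified sketch -- not PostgreSQL's predicate_refuted_by().  test2_0_20's
# constraint is (id IS NOT NULL AND id >= 0 AND id < 20), modeled here as a
# half-open integer range.
CONSTRAINT_LO, CONSTRAINT_HI = 0, 20      # test2_0_20's range

def refuted_by_constraint(eq_val):
    """An 'id = eq_val' arm contradicts the constraint iff eq_val lies outside it."""
    return not (CONSTRAINT_LO <= eq_val < CONSTRAINT_HI)

or_arms = [5, 25]                         # WHERE id = 5 OR id = 25
useful_arms = [v for v in or_arms if not refuted_by_constraint(v)]
# useful_arms == [5]: pruning inside test2_0_20 only needs to consider id = 5,
# so the default sub-partition test2_10_20_def can be pruned, because 5 is
# covered by test2_0_10's range [0, 10).
```

Under this model the simplification is done only for pruning purposes; as noted above, the full OR clause would still be passed to the final plan as the filter.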
RE: Problem with default partition pruning
Hi,

> Hi,
>
> On 2019/03/23 2:36, Thibaut Madelaine wrote:
> > I tested your last patch and, if I didn't mix up patches at the end of a
> > too long week, I get a problem when querying the sub-sub partition:
> >
> > test=# explain select * from test2_0_10 where id = 25;
> >                         QUERY PLAN
> > ------------------------------------------------------------
> >  Seq Scan on test2_0_10  (cost=0.00..25.88 rows=6 width=36)
> >    Filter: (id = 25)
> > (2 rows)
>
> The problem here is not really related to partition pruning, but another
> problem I recently sent an email about:
>
> https://www.postgresql.org/message-id/9813f079-f16b-61c8-9ab7-4363cab28d80%40lab.ntt.co.jp
>
> The problem in this case is that *constraint exclusion* is not working,
> because the partition constraint is not loaded by the planner.  Note that
> pruning is only used if a query specifies the parent table, not a
> partition.

Thanks for the comments.  I saw that email.  Also, I checked that the query
Thibaut mentioned worked correctly with Amit's patch discussed in that
thread.

Best regards,
Yuzuko Hosoya
RE: Problem with default partition pruning
Hi,

> Maybe we should have two patches as we seem to be improving two things:
>
> 1. Patch to fix problems with default partition pruning originally
>    reported by Hosoya-san
>
> 2. Patch to determine if a given clause contradicts a sub-partitioned
>    table's partition constraint, fixing problems unearthed by Thibaut's
>    tests

I attached the latest patches according to Amit's comment.
v3_default_partition_pruning.patch fixes the default partition pruning
problems, and ignore_contradictory_where_clauses_at_partprune_step.patch
fixes the sub-partition problems Thibaut tested.

Best regards,
Yuzuko Hosoya

ignore_contradictory_where_clauses_at_partprune_step.patch
Description: Binary data

v3_default_partition_pruning.patch
Description: Binary data
RE: Problem with default partition pruning
Amit-san,

Thanks for the comments.

> Thanks for dividing the patches that way.
>
> Would it be a good idea to add some new test cases to these patches, just
> so it's easily apparent what we're changing?

Yes, I agree with you.

> So, we could add the test case presented by Thibaut at the following link
> to default_partition_pruning.patch:
>
> https://www.postgresql.org/message-id/a4968068-6401-7a9c-8bd4-6a3bc9164a86%40dalibo.com
>
> And, another reported at the following link to
> ignore_contradictory_where_clauses_at_partprune_step.patch:
>
> https://www.postgresql.org/message-id/bd03f475-30d4-c4d0-3d7f-d2fbde755971%40dalibo.com
>
> Actually, it might be possible/better to construct the test queries in
> partition_prune.sql using the existing tables in that script, that is,
> without defining new tables just for adding the new test cases.  If not,
> maybe it's OK to create the new tables too.

I see.  I added some test cases to each patch according to the tests
discussed in this thread.

However, I found another problem as follows.  This query should output
"One-Time Filter: false" because rlp3's constraints contradict the WHERE
clause.

postgres=# \d+ rlp3
                              Partitioned table "public.rlp3"
 Column |       Type        | Collation | Nullable | Default | Storage  | Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
 b      | character varying |           |          |         | extended |              |
 a      | integer           |           |          |         | plain    |              |
Partition of: rlp FOR VALUES FROM (15) TO (20)
Partition constraint: ((a IS NOT NULL) AND (a >= 15) AND (a < 20))
Partition key: LIST (b varchar_ops)
Partitions: rlp3abcd FOR VALUES IN ('ab', 'cd'),
            rlp3efgh FOR VALUES IN ('ef', 'gh'),
            rlp3nullxy FOR VALUES IN (NULL, 'xy'),
            rlp3_default DEFAULT

postgres=# explain select * from rlp3 where a = 2;
                           QUERY PLAN
---------------------------------------------------------------
 Append  (cost=0.00..103.62 rows=24 width=36)
   ->  Seq Scan on rlp3abcd  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
   ->  Seq Scan on rlp3efgh  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
   ->  Seq Scan on rlp3nullxy  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
   ->  Seq Scan on rlp3_default  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
(9 rows)

I think the place of the contradiction check was wrong in
ignore_contradictory_where_clauses_at_partprune_step.patch, so I fixed it.

Attached the latest patches.  Please check them again.

Best regards,
Yuzuko Hosoya

v2_ignore_contradictory_where_clauses_at_partprune_step.patch
Description: Binary data

v4_default_partition_pruning.patch
Description: Binary data
RE: Problem with default partition pruning
Amit-san,

> -----Original Message-----
> From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp]
> Sent: Friday, April 05, 2019 6:47 PM
> To: Yuzuko Hosoya ; 'Thibaut' ; 'Imai, Yoshikazu'
> Cc: 'PostgreSQL Hackers'
> Subject: Re: Problem with default partition pruning
>
> Hosoya-san,
>
> On 2019/04/04 13:00, Yuzuko Hosoya wrote:
> > I added some test cases to each patch according to the tests discussed
> > in this thread.
>
> Thanks a lot.
>
> > However, I found another problem as follows.  This query should output
> > "One-Time Filter: false" because rlp3's constraints contradict the
> > WHERE clause.
> >
> > postgres=# \d+ rlp3
> >                               Partitioned table "public.rlp3"
> >  Column |       Type        | Collation | Nullable | Default | Storage  | Stats target | Description
> > --------+-------------------+-----------+----------+---------+----------+--------------+-------------
> >  b      | character varying |           |          |         | extended |              |
> >  a      | integer           |           |          |         | plain    |              |
> > Partition of: rlp FOR VALUES FROM (15) TO (20)
> > Partition constraint: ((a IS NOT NULL) AND (a >= 15) AND (a < 20))
> > Partition key: LIST (b varchar_ops)
> > Partitions: rlp3abcd FOR VALUES IN ('ab', 'cd'),
> >             rlp3efgh FOR VALUES IN ('ef', 'gh'),
> >             rlp3nullxy FOR VALUES IN (NULL, 'xy'),
> >             rlp3_default DEFAULT
> >
> > postgres=# explain select * from rlp3 where a = 2;
> >                           QUERY PLAN
> > ---------------------------------------------------------------
> >  Append  (cost=0.00..103.62 rows=24 width=36)
> >    ->  Seq Scan on rlp3abcd  (cost=0.00..25.88 rows=6 width=36)
> >          Filter: (a = 2)
> >    ->  Seq Scan on rlp3efgh  (cost=0.00..25.88 rows=6 width=36)
> >          Filter: (a = 2)
> >    ->  Seq Scan on rlp3nullxy  (cost=0.00..25.88 rows=6 width=36)
> >          Filter: (a = 2)
> >    ->  Seq Scan on rlp3_default  (cost=0.00..25.88 rows=6 width=36)
> >          Filter: (a = 2)
> > (9 rows)
>
> This one too would be solved with the other patch I mentioned, which
> fixes get_relation_info() to load the partition constraint so that
> constraint exclusion can use it.
>
> The partition in the earlier example given by Thibaut is a leaf
> partition, whereas rlp3 above is a sub-partitioned partition, but both
> are partitions nonetheless.
>
> Fixing partprune.c like we're doing with
> v2_ignore_contradictory_where_clauses_at_partprune_step.patch only works
> for the latter, because only partitioned tables visit partprune.c.
>
> OTOH, the other patch only applies to situations where
> constraint_exclusion = on.
>
I see.  I think the following example discussed in this thread before would
also be solved with your patch, not
v2_ignore_contradictory_where_clauses_at_partprune_step.patch.

postgres=# set constraint_exclusion to on;
postgres=# explain select * from test2_0_20 where id = 25;
                 QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=0)
   One-Time Filter: false
(2 rows)

> > I think that the place of the contradiction check was wrong in
> > ignore_contradictory_where_clauses_at_partprune_step.patch, so I fixed
> > it.
>
> Thanks.  The patch contains some whitespace noise:
>
> $ git diff --check
> src/backend/partitioning/partprune.c:790: trailing whitespace.
> +	 * given its partition constraint, we can ignore it,
> src/backend/partitioning/partprune.c:791: trailing whitespace.
> +	 * that is not try to pass it to the pruning code.
> src/backend/partitioning/partprune.c:792: trailing whitespace.
> +	 * We should do that especially to avoid pruning code
> src/backend/partitioning/partprune.c:810: trailing whitespace.
> +
> src/test/regress/sql/partition_prune.sql:87: trailing whitespace.
> +-- where clause contradicts sub-partition's constraint
>
> Can you please fix it?
>
Thanks for checking.  I'm attaching the latest patch.

> BTW, now I'm a bit puzzled between whether this case should be fixed by
> hacking on partprune.c like this patch does, or whether to work on
> getting the other patch committed and expect users to set
> constraint_exclusion = on for this to behave as expected.  The original
> intention of setting parti
RE: Problem with default partition pruning
Horiguchi-san,

Thanks for your comments.

> -----Original Message-----
> From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp]
> Sent: Tuesday, April 09, 2019 10:33 AM
> To: hosoya.yuz...@lab.ntt.co.jp
> Cc: langote_amit...@lab.ntt.co.jp; thibaut.madela...@dalibo.com;
> imai.yoshik...@jp.fujitsu.com; pgsql-hackers@lists.postgresql.org
> Subject: Re: Problem with default partition pruning
>
> Sigh..
>
> At Tue, 09 Apr 2019 10:28:48 +0900 (Tokyo Standard Time), Kyotaro
> HORIGUCHI wrote in
> <20190409.102848.252476604.horiguchi.kyot...@lab.ntt.co.jp>
> > As a second thought: the partition constraint is not a constraint
> > expression, so it's fair to apply partqual ignoring
> > constraint_exclusion.  The variable is set false to skip useless
> > expression evaluation on all partitions, but partqual should be
> > evaluated just once.  Sorry for my confusion.
> >
> > So it is still wrong that the new code is added in
> > gen_partprune_steps_internal.
>
> So it is still wrong that the new code is added at the beginning of the
> loop on clauses in gen_partprune_steps_internal.
>
> > If partqual results true and the clause is long, the partqual is
> > evaluated uselessly at every recursion.
> >
> > Maybe we should do that when we find that the current clause doesn't
> > match part attributes.  Specifically just after the for loop
> > "for (i = 0; i < part_scheme->partnattrs; i++)".
>
I think we should check whether the WHERE clause contradicts the partition
constraint even when the clause matches part attributes.  So I moved the
"if (partqual)" block to the beginning of the loop you mentioned.

I'm attaching the latest version.  Could you please check it again?

Best regards,
Yuzuko Hosoya

v4_ignore_contradictory_where_clauses_at_partprune_step.patch
Description: Binary data
Re: Problem with default partition pruning
Horiguchi-san,

> -----Original Message-----
> From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp]
> Sent: Tuesday, April 09, 2019 5:37 PM
> To: hosoya.yuz...@lab.ntt.co.jp
> Cc: langote_amit...@lab.ntt.co.jp; thibaut.madela...@dalibo.com;
> imai.yoshik...@jp.fujitsu.com; pgsql-hackers@lists.postgresql.org
> Subject: Re: Problem with default partition pruning
>
> Hi.
>
> At Tue, 9 Apr 2019 16:41:47 +0900, "Yuzuko Hosoya" wrote in
> <00cf01d4eea7$afa43370$0eec9a50$@lab.ntt.co.jp>
> > > So it is still wrong that the new code is added at the beginning of
> > > the loop on clauses in gen_partprune_steps_internal.
> > >
> > > > If partqual results true and the clause is long, the partqual is
> > > > evaluated uselessly at every recursion.
> > > >
> > > > Maybe we should do that when we find that the current clause
> > > > doesn't match part attributes.  Specifically just after the for
> > > > loop "for (i = 0; i < part_scheme->partnattrs; i++)".
> > >
> > I think we should check whether the WHERE clause contradicts the
> > partition constraint even when the clause matches part attributes.  So
> > I moved
>
> Why?  If the clauses contain a clause on a partition key, the clause is
> involved in the determination of whether a partition survives or not in
> the ordinary way.  Could you show how, or on what configuration (tables
> and query), it happens that such a matching clause needs to be checked
> against partqual?
>
In Thibaut's tests, we found that partition pruning didn't work as expected
when we scanned a sub-partition using a WHERE clause which contradicts the
sub-partition's constraint.  The example discussed in this thread is as
follows.

postgres=# \d+ test2
                           Partitioned table "public.test2"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
 id     | integer |           |          |         | plain    |              |
 val    | text    |           |          |         | extended |              |
Partition key: RANGE (id)
Partitions: test2_0_20 FOR VALUES FROM (0) TO (20), PARTITIONED,
            test2_20_plus_def DEFAULT

postgres=# \d+ test2_0_20
                        Partitioned table "public.test2_0_20"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
 id     | integer |           |          |         | plain    |              |
 val    | text    |           |          |         | extended |              |
Partition of: test2 FOR VALUES FROM (0) TO (20)
Partition constraint: ((id IS NOT NULL) AND (id >= 0) AND (id < 20))
Partition key: RANGE (id)
Partitions: test2_0_10 FOR VALUES FROM (0) TO (10),
            test2_10_20_def DEFAULT

postgres=# explain (costs off) select * from test2 where id=5 or id=20;
            QUERY PLAN
-----------------------------------
 Append
   ->  Seq Scan on test2_0_10
         Filter: ((id = 5) OR (id = 20))
   ->  Seq Scan on test2_10_20_def
         Filter: ((id = 5) OR (id = 20))
   ->  Seq Scan on test2_20_plus_def
         Filter: ((id = 5) OR (id = 20))
(7 rows)

postgres=# explain (costs off) select * from test2_0_20 where id=25;
          QUERY PLAN
-------------------------------
 Seq Scan on test2_10_20_def
   Filter: (id = 25)
(2 rows)

So I think we have to check if the WHERE clause contradicts the
sub-partition's constraint regardless of whether the clause matches part
attributes or not.

> The "if (partconstr)" block uselessly runs for every clause in the clause
> tree other than BoolExpr.  If we want to do that, isn't just doing
> predicate_refuted_by(partconstr, clauses, false) sufficient before
> looping over clauses?

Yes, I tried doing that in the original patch.

> > > > "if (partqual)" block to the beginning of the loop you mentioned.
> > > >
> > > > I'm attaching the latest version.  Could you please check it again?
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center

Best regards,
Yuzuko Hosoya
Runtime pruning problem
Hi all,

I found a runtime pruning test case which may be a problem, as follows:

create table t1 (id int, dt date) partition by range(dt);
create table t1_1 partition of t1 for values from ('2019-01-01') to ('2019-04-01');
create table t1_2 partition of t1 for values from ('2019-04-01') to ('2019-07-01');
create table t1_3 partition of t1 for values from ('2019-07-01') to ('2019-10-01');
create table t1_4 partition of t1 for values from ('2019-10-01') to ('2020-01-01');

In this example, current_date is 2019-04-16.

postgres=# explain select * from t1 where dt = current_date + 400;
                         QUERY PLAN
------------------------------------------------------------
 Append  (cost=0.00..198.42 rows=44 width=8)
   Subplans Removed: 3
   ->  Seq Scan on t1_1  (cost=0.00..49.55 rows=11 width=8)
         Filter: (dt = (CURRENT_DATE + 400))
(4 rows)

postgres=# explain analyze select * from t1 where dt = current_date + 400;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Append  (cost=0.00..198.42 rows=44 width=8) (actual time=0.000..0.001 rows=0 loops=1)
   Subplans Removed: 3
   ->  Seq Scan on t1_1  (cost=0.00..49.55 rows=11 width=8) (never executed)
         Filter: (dt = (CURRENT_DATE + 400))
 Planning Time: 0.400 ms
 Execution Time: 0.070 ms
(6 rows)

I realized t1_1 was not actually scanned, since "never executed" was
displayed in the plan using EXPLAIN ANALYZE.  But I think "One-Time Filter:
false" and "Subplans Removed: ALL", or something like that, should be
displayed instead.

What do you think?

Best regards,
Yuzuko Hosoya
NTT Open Source Software Center
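The situation above can be modeled with a small sketch (plain Python with invented names, not the executor's code; it assumes the four range partitions and the example date given above): CURRENT_DATE + 400 is a stable expression that is only computable at execution time, and with these dates it matches none of the four partitions, so in principle every subplan could be removed rather than three of four.

```python
import datetime

# Hypothetical model of t1's range partitions, half-open [lo, hi).
PARTS = {
    't1_1': (datetime.date(2019, 1, 1), datetime.date(2019, 4, 1)),
    't1_2': (datetime.date(2019, 4, 1), datetime.date(2019, 7, 1)),
    't1_3': (datetime.date(2019, 7, 1), datetime.date(2019, 10, 1)),
    't1_4': (datetime.date(2019, 10, 1), datetime.date(2020, 1, 1)),
}

def prune(dt):
    """Partitions surviving 'WHERE dt = <dt>'."""
    return [name for name, (lo, hi) in PARTS.items() if lo <= dt < hi]

# At executor startup, CURRENT_DATE + 400 becomes computable; with the
# example's date (2019-04-16) the target falls in 2020, outside every
# partition, so no subplan at all needs to run.
survivors = prune(datetime.date(2019, 4, 16) + datetime.timedelta(days=400))
# survivors == [], yet EXPLAIN shows "Subplans Removed: 3" with t1_1 kept
# (though marked "never executed" under EXPLAIN ANALYZE).
```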
A typo in partprune.c
Hi,

I found a typo in the comment of get_matching_range_bounds.

 /*
- * get_matching_range_datums
+ * get_matching_range_bounds
  *		Determine the offsets of range bounds matching the specified values,
  *		according to the semantics of the given operator strategy

Here is the patch to fix it.

Best regards,
Yuzuko Hosoya
NTT Open Source Software Center

fix_partprune_typo.patch
Description: Binary data
RE: A typo in partprune.c
Hi Michael,

> From: Michael Paquier [mailto:mich...@paquier.xyz]
> Sent: Thursday, November 08, 2018 8:17 PM
>
> On Thu, Nov 08, 2018 at 07:19:14PM +0900, Yuzuko Hosoya wrote:
> > Here is the patch to fix it.
>
> Thanks, committed.

Thank you.

Yuzuko Hosoya
NTT Open Source Software Center
Problem with default partition pruning
Hi,

I found a bug in default partition pruning when executing a range query.

postgres=# create table test1(id int, val text) partition by range (id);
postgres=# create table test1_1 partition of test1 for values from (0) to (100);
postgres=# create table test1_2 partition of test1 for values from (150) to (200);
postgres=# create table test1_def partition of test1 default;

postgres=# explain select * from test1 where id > 0 and id < 30;
                         QUERY PLAN
------------------------------------------------------------
 Append  (cost=0.00..11.83 rows=59 width=11)
   ->  Seq Scan on test1_1  (cost=0.00..5.00 rows=58 width=11)
         Filter: ((id > 0) AND (id < 30))
   ->  Seq Scan on test1_def  (cost=0.00..6.53 rows=1 width=12)
         Filter: ((id > 0) AND (id < 30))
(5 rows)

There is no need to scan the default partition, but it is scanned.

In the current implementation, whether the default partition is scanned or
not is determined according to each condition of the given WHERE clause at
get_matching_range_bounds().  In this example, scan_default is set true
according to id > 0 because id >= 200 matches the default partition.
Similarly, according to id < 30, scan_default is set true.  Then, these
results are combined according to AND/OR at perform_pruning_combine_step().
In this case, the final result's scan_default is set true.

The modifications I made are as follows:
- get_matching_range_bounds() determines only the offsets of range bounds
  according to each condition
- These results are combined at perform_pruning_combine_step()
- Whether the default partition is scanned or not is determined at
  get_matching_partitions()

Attached the patch.  Any feedback is greatly appreciated.

Best regards,
---
Yuzuko Hosoya
NTT Open Source Software Center

default_partition_pruning.patch
Description: Binary data
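The difference between the per-clause decision and the proposed deferred decision can be illustrated with a toy model (plain Python, not the actual partprune.c code; the `covered()` helper and the integer-interval representation are invented for illustration): non-default partitions cover [0, 100) and [150, 200), and everything else belongs to the default partition.

```python
# Toy model of the report above -- NOT PostgreSQL's code.  Non-default
# partitions cover the half-open ranges [0, 100) and [150, 200); any other
# id goes to the default partition.
RANGES = [(0, 100), (150, 200)]

def covered(lo, hi, ranges):
    """True if every integer in [lo, hi] falls inside some range."""
    x = lo
    for rlo, rhi in sorted(ranges):
        if x < rlo:
            return False        # gap before this range -> default needed
        if x < rhi:
            x = rhi             # ranges are half-open [rlo, rhi)
        if x > hi:
            return True
    return x > hi

# Per-clause decision (the buggy behavior): each clause alone "needs" the
# default partition, and AND-combining the booleans keeps it.
needs_default_gt0 = not covered(1, 10**9, RANGES)     # id > 0: ids >= 200
needs_default_lt30 = not covered(-10**9, 29, RANGES)  # id < 30: ids < 0
buggy_scan_default = needs_default_gt0 and needs_default_lt30   # True

# Proposed fix: combine the clause intervals first, decide default once.
# id > 0 AND id < 30 -> [1, 29], fully covered by [0, 100).
fixed_scan_default = not covered(1, 29, RANGES)       # False
```

Under this model the default partition survives only when the combined interval is not fully covered by the non-default ranges, which matches the patch's idea of deferring the scan_default decision to get_matching_partitions().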
RE: Problem with default partition pruning
Amit-san,

> From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp]
> Sent: Wednesday, February 27, 2019 11:22 AM
>
> Hosoya-san,
>
> On 2019/02/22 17:14, Yuzuko Hosoya wrote:
> > Hi,
> >
> > I found a bug in default partition pruning when executing a range query.
> >
> > postgres=# create table test1(id int, val text) partition by range (id);
> > postgres=# create table test1_1 partition of test1 for values from (0) to (100);
> > postgres=# create table test1_2 partition of test1 for values from (150) to (200);
> > postgres=# create table test1_def partition of test1 default;
> >
> > postgres=# explain select * from test1 where id > 0 and id < 30;
> >                          QUERY PLAN
> > ------------------------------------------------------------
> >  Append  (cost=0.00..11.83 rows=59 width=11)
> >    ->  Seq Scan on test1_1  (cost=0.00..5.00 rows=58 width=11)
> >          Filter: ((id > 0) AND (id < 30))
> >    ->  Seq Scan on test1_def  (cost=0.00..6.53 rows=1 width=12)
> >          Filter: ((id > 0) AND (id < 30))
> > (5 rows)
> >
> > There is no need to scan the default partition, but it is scanned.
> >
> > In the current implementation, whether the default partition is scanned
> > or not is determined according to each condition of the given WHERE
> > clause at get_matching_range_bounds().  In this example, scan_default
> > is set true according to id > 0 because id >= 200 matches the default
> > partition.  Similarly, according to id < 30, scan_default is set true.
> > Then, these results are combined according to AND/OR at
> > perform_pruning_combine_step().  In this case, the final result's
> > scan_default is set true.
> >
> > The modifications I made are as follows:
> > - get_matching_range_bounds() determines only the offsets of range
> >   bounds according to each condition
> > - These results are combined at perform_pruning_combine_step()
> > - Whether the default partition is scanned or not is determined at
> >   get_matching_partitions()
> >
> > Attached the patch.  Any feedback is greatly appreciated.
>
> Thank you for reporting.  Can you please add this to the March CF in the
> Bugs category so as not to lose track of this?
>
> I will try to send review comments soon.
>
Thank you for your reply.  I added this to the March CF.

Regards,
Yuzuko Hosoya
NTT Open Source Software Center
Improve selectivity estimate for range queries
Hi,

I found a problem with selectivity estimates for range queries on master, as
follows.  This is my test case:

postgres=# CREATE TABLE test(id int, val text);
CREATE TABLE
postgres=# INSERT INTO test SELECT i, 'testval' FROM generate_series(0,2)i;
INSERT 0 3
postgres=# VACUUM analyze test;
VACUUM
postgres=# EXPLAIN ANALYZE SELECT * FROM test WHERE id > 0 and id < 1;
                          QUERY PLAN
---------------------------------------------------------------
 Seq Scan on test  (cost=0.00..613.00 rows=150 width=12) (actual time=0.044..21.371 rows= loops=1)
   Filter: ((id > 0) AND (id < 1))
   Rows Removed by Filter: 20001
 Planning Time: 0.099 ms
 Execution Time: 37.142 ms
(5 rows)

Although the actual number of rows was far larger, the estimated number of
rows was 150.

So I dug in to see what happened, and I think the following part of
clauselist_selectivity caused this problem.

    /*
     * Now scan the rangequery pair list.
     */
    while (rqlist != NULL)
    {
        RangeQueryClause *rqnext;

        if (rqlist->have_lobound && rqlist->have_hibound)
        {
            /* Successfully matched a pair of range clauses */
            Selectivity s2;

            /*
             * Exact equality to the default value probably means the
             * selectivity function punted.  This is not airtight but
             * should be good enough.
             */
            if (rqlist->hibound == DEFAULT_INEQ_SEL ||
                rqlist->lobound == DEFAULT_INEQ_SEL)
            {
                s2 = DEFAULT_RANGE_INEQ_SEL;
            }
            else
            {
                s2 = rqlist->hibound + rqlist->lobound - 1.0;

In my environment, the selectivity for id > 0 was 0.1, and the selectivity
for id < 1 was 0.1.  Then, the values of rqlist->hibound and
rqlist->lobound are set accordingly.  On the other hand, DEFAULT_INEQ_SEL
is 0.3333333333333333.  As a result, the final selectivity was computed as
DEFAULT_RANGE_INEQ_SEL, since the condition of the second if statement was
satisfied.  However, these selectivities were computed according to the
statistics, so the final selectivity should have been calculated from
rqlist->hibound + rqlist->lobound - 1.0.

My test case might be uncommon, but I think it could cause some problems.

To handle such cases, I've thought up an idea based on a previous commit[1]
which solved a similar problem related to DEFAULT_NUM_DISTINCT.  Just like
that modification, I added a flag showing whether the selectivity was
calculated according to the statistics or not to clauselist_selectivity,
and used it as the condition of the if statement instead of
rqlist->hibound == DEFAULT_INEQ_SEL and rqlist->lobound == DEFAULT_INEQ_SEL.
I am afraid that the changes were more than I had expected, but we can
estimate selectivity correctly.

Patch attached.  Do you have any thoughts?

[1] https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=4c2777d0b733220d9029f78817af8ce

Best regards,
Yuzuko Hosoya
NTT Open Source Software Center

improve_selectivity_estimate_for_range_queries.patch
Description: Binary data
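The pitfall and the isdefault-flag idea can be modeled in a few lines (a sketch with invented names, not the actual clauselist_selectivity code; the 30000-row uniform distribution is an assumption chosen so that one statistics-derived selectivity lands exactly on DEFAULT_INEQ_SEL):

```python
# Sketch of the issue and the proposed fix -- invented names, not the actual
# clauselist_selectivity code.  Assumes a 30000-row table with ids 0..29999,
# so that the histogram-derived sel(id < 10000) is exactly 1/3, which
# collides with DEFAULT_INEQ_SEL.
DEFAULT_INEQ_SEL = 0.3333333333333333
DEFAULT_RANGE_INEQ_SEL = 0.005
NROWS = 30000

def combine_current(lobound, hibound):
    """Current heuristic: exact equality with DEFAULT_INEQ_SEL means 'punted'."""
    if lobound == DEFAULT_INEQ_SEL or hibound == DEFAULT_INEQ_SEL:
        return DEFAULT_RANGE_INEQ_SEL
    return hibound + lobound - 1.0

def combine_patched(lobound, lo_isdefault, hibound, hi_isdefault):
    """Patched idea: trust an explicit isdefault flag instead of the value."""
    if lo_isdefault or hi_isdefault:
        return DEFAULT_RANGE_INEQ_SEL
    return hibound + lobound - 1.0

lobound = 29999 / 30000   # sel(id > 0), derived from statistics
hibound = 10000 / 30000   # sel(id < 10000), derived from statistics: 1/3

rows_current = round(NROWS * combine_current(lobound, hibound))
rows_patched = round(NROWS * combine_patched(lobound, False, hibound, False))
# rows_current takes the false "punted" path (NROWS * 0.005 = 150), while
# rows_patched uses hibound + lobound - 1.0 and stays close to the truth.
```

This shows why exact equality with the default constant is "not airtight": a perfectly ordinary statistics result can collide with it, whereas an explicit flag carried alongside the selectivity cannot.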
RE: Improve selectivity estimate for range queries
Hi, Thanks for the comments. I attach the v2 patch. > From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp] > Sent: Friday, December 21, 2018 12:25 PM > > Hello. > > At Thu, 20 Dec 2018 17:21:29 +0900, "Yuzuko Hosoya" > wrote in > <008701d4983d$02e731c0$08b59540$@lab.ntt.co.jp> > > In my environment, the selectivity for id > 0 was 0.1, > > and the selectivity for id < 1 was 0.1. Then, the > > value of rqlist->hibound and rqlist->lobound are set respectively. > > On the other hand, DEFAULT_INEQ_SEL is 0.. As a > > result, the final selectivity is computed according to > > DEFAULT_RANGE_INEQ_SEL, since the condition of the second if statement > > was satisfied. However, these selectivities were computed according to > > the statistics, so the final selectivity had to be calculated from > > rqlist->hibound + rqlist->lobound > - 1.0. > > My test case might be uncommon, but I think it would cause some problems. > > Yeah, its known behavior as the comment just above. If in your example query, > the table weer very large > and had an index it could run faultly an index scan over a fair amount of > tuples. But I'm not sure > how much it matters and your example doesn't show that. > > The behvavior discussed here is introduced by this commit. > > | commit 547bb4a7f2bdccad9253a99211ce84b3f9de485a > | Author: Tom Lane > | Date: Tue Nov 9 00:34:46 2004 + > | > | Use a hopefully-more-reliable method of detecting default selectivity > | estimates when combining the estimates for a range query. As pointed > out > | by Miquel van Smoorenburg, the existing check for an impossible combined > | result would quite possibly fail to detect one default and one > non-default > | input. It seems better to use the default range query estimate in such > | cases. To do so, add a check for an estimate of exactly > DEFAULT_INEQ_SEL. 
> | This is a bit ugly because it introduces additional coupling between > | clauselist_selectivity and scalarltsel/scalargtsel, but it's not like > | there wasn't plenty already... > > > To handle such cases I've thought up of an idea based on a previous > > commit[1] which solved a similar problem related to > > DEFAULT_NUM_DISTINCT. Just like this modification, I added a flag > > which shows whether the selectivity > > The commit [1] added almost no complexity, but this does. It affects many > functions to introduce tighter > coupling between operator-selectivity functions and clause selectivity > functions. > isdefault travels alone too long just to bind remote functions. We might > need a more principled way to > handle that. > Yes, indeed. But I have not found a better way than this approach yet. > > was calculated according to the statistics or not to > > clauselist_selectivity, and used it as a condition of the if statement > > instead of > > rqlist->hibound == DEFAULT_INEQ_SEL and rqlist->lobound == DEFAULT_INEQ_SEL. > > I am afraid that changes were more than I had expected, but we can > > estimate selectivity correctly. > > > > Patch attached. Do you have any thoughts? > > I've just run over the patch, and I have some comments. > > + if (isdefault) > + rqelem->lobound_isdefault = true; > > This is taking the disjunction of all lobounds ever added; I suppose it should be > the last lobound's > isdefault. > Indeed. I fixed it. > As you see, four other instances of "selec ==/!= DEFAULT_*" exist in > mergejoinsel. Don't they lead > to a similar kind of problem? > I didn't encounter similar problems, but as you said, such problems could occur due to the condition selec != DEFAULT_INEQ_SEL. I changed these conditions into "!isdefault". > > > Selectivity lobound;    /* Selectivity of a var > something clause */ > Selectivity hibound;    /* Selectivity of a var < something clause */ > +    bool lobound_isdefault;    /* lobound is a default selectivity? 
*/ > +    bool hibound_isdefault;    /* hibound is a default selectivity? */ > } RangeQueryClause; > > Around [1], there was a discussion about introducing the notion of > uncertainty to selectivity. > The isdefault is a kind of uncertainty value indicating '0/100% uncertain'. > So my feeling is > that it would be better for Selectivity to have a certainty component, and then to build selectivity arithmetic > involving uncertainty, though I don't have a cle
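The fragility that motivates the isdefault flags can be shown in a few lines. This Python illustration uses PostgreSQL's actual DEFAULT_INEQ_SEL value; the 100-of-300 sample count is a made-up but plausible example.

```python
# PostgreSQL's default guess for "var > const" when no statistics apply
# (src/include/utils/selfuncs.h):
DEFAULT_INEQ_SEL = 0.3333333333333333

# Hypothetical: ANALYZE finds that 100 of 300 sampled rows satisfy the
# clause.  Both sides round to the same IEEE-754 double, so a
# "selec == DEFAULT_INEQ_SEL" test cannot tell this real estimate from
# a default guess.
real_estimate = 100 / 300
print(real_estimate == DEFAULT_INEQ_SEL)   # -> True
```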
RE: Improve selectivity estimate for range queries
Hi, Thanks for the comments, and I'm sorry for the late reply. > From: Tom Lane [mailto:t...@sss.pgh.pa.us] > Sent: Friday, January 11, 2019 7:04 AM > > Robert Haas writes: > > On Fri, Dec 21, 2018 at 11:50 AM Tom Lane wrote: > >> A smaller-footprint way to fix the immediate problem might be to > >> change the values of DEFAULT_INEQ_SEL and friends so that they're > >> even less likely to be matched by accident. That is, instead of > >> 0.3333333333333333, use 0.342 or some such. > > > That's not a dumb idea, but it seems pretty unprincipled to me, and to > > be honest I'm kind of surprised that you're not proposing something > > cleaner. > > The problem is the invasiveness of such a change (large) vs the benefit (not > so large). The upthread > patch attempted to add a separate signaling path, but it was very incomplete > --- and yet both I and > Horiguchi-san thought it was already too messy. > > Maybe at some point we'll go over to something reasonably principled, like > adding confidence intervals > to all selectivity estimates. That would be *really* invasive but perhaps > would bring enough benefit > to justify itself. But the current patch is just attempting to fix one > extremely narrow pain point, > and if that is all it's doing it should have a commensurately small > footprint. So I don't think the > submitted patch looks good from a cost/benefit standpoint. > Yes, I agree with you. Indeed, the patch I attached is poor in cost-effectiveness. However, I want to solve the problem that real estimates can happen to equal the default values, as in this case, even though it's a narrow pain point. So I tried distinguishing explicitly between real estimates and otherwise, as Robert said. The idea Tom proposed and Horiguchi-san tried seems reasonable, but I'm concerned whether range queries really cannot match 0.342 (or some such) by accident in any environment. Is the check Horiguchi-san did enough to prove that? 
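Hosoya-san's concern can be illustrated directly: even an arbitrary-looking constant such as 0.342 can be hit exactly by a real estimate. The sample counts below are hypothetical, chosen only to show the collision.

```python
# Tom Lane's suggestion: make the default constants "less likely to be
# matched by accident", e.g. 0.342.  But any double constant can still be
# produced exactly by ordinary statistics arithmetic.
candidate_default = 0.342
real_estimate = 171 / 500   # e.g. 171 of 500 sampled rows match the clause
print(real_estimate == candidate_default)   # -> True
```

171/500 is exactly 0.342 as a rational number, so the correctly rounded division yields the same IEEE-754 double as the literal constant, and an equality check would again misclassify a statistics-based estimate as a default.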
Best regards, Yuzuko Hosoya NTT Open Source Software Center
Proposal: Partitioning Advisor for PostgreSQL
Hello, I'm Yuzuko Hosoya. This is my first PostgreSQL project participation. I have been developing a partitioning advisor prototype with Julien Rouhaud. It will be a new feature of HypoPG[1], which is a PostgreSQL extension, and will help with partitioning design tuning. Currently, HypoPG only supports index design tuning; it allows users to define hypothetical indexes for real tables and shows the resulting queries' plan/cost with EXPLAIN as if they were actually constructed. Since declarative partitioning will be greatly improved in PostgreSQL 11 and further versions, there are emerging needs to support partitioning design tuning. This is why we are working on a partitioning advisor. We plan to release the first version of the partitioning advisor for PostgreSQL 11, and then improve it for PostgreSQL 12.

Overview of partitioning advisor
---
- Partitioning advisor allows users to define multiple hypothetical partitioning schemes on real tables and real data
- PostgreSQL can show the resulting queries' plan/cost with EXPLAIN using hypothetical partitioning schemes

Users can quickly check how their queries would behave if some tables were partitioned, and try different partitioning schemes (for instance, to balance query efficiency vs. maintenance efficiency). Partitioning advisor works as follows:

Usage
-
0. Consider this target table, t1

#= CREATE TABLE t1 (a int, b text);
#= INSERT INTO t1 SELECT i, 'test' FROM generate_series(1,299) i;
#= EXPLAIN SELECT * FROM t1;
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on t1  (cost=0.00..4.99 rows=299 width=9)
(1 row)

1. Partition the target table hypothetically

#= SELECT * FROM hypopg_partition_table('t1','partition by range(a)');

hypopg_partition_table() defines the hypothetical range-partitioned table 't1' with the partition key 'a' and stores this information in backend-local memory.

2. 
Create hypothetical partitions

#= SELECT * FROM hypopg_add_partition('t1_1','partition of t1 for values from (1) to (100)');
#= SELECT * FROM hypopg_add_partition('t1_2','partition of t1 for values from (100) to (300)');

hypopg_add_partition() defines the hypothetical partitions t1_1 and t1_2 according to their bounds 'from (1) to (100)' and 'from (100) to (300)' respectively, and stores this information in backend-local memory.

3. PostgreSQL can show the resulting queries' plan/cost with EXPLAIN

#= EXPLAIN SELECT * FROM t1;
                          QUERY PLAN
---------------------------------------------------------------
 Append  (cost=0.00..7.49 rows=299 width=9)
   ->  Seq Scan on t1 t1_1  (cost=0.00..1.99 rows=99 width=9)
   ->  Seq Scan on t1 t1_2  (cost=0.00..4.00 rows=200 width=9)
(3 rows)

PostgreSQL retrieves the hypothetical partitioning schemes from HypoPG, and then, if the referenced table is defined as a hypothetical partitioned table, creates plans using them. This is a simple example; in addition, the advisor enables us to simulate range/list/hash partitioning, partition pruning, N-way join, and partition-wise join/aggregation. It is already helpful for users designing partitioning schemes.

Current implementation
---
We mainly use get_relation_info_hook(). What we do in this hook is inject hypothetical partitioning schemes according to the user's definitions. At first, we do all the processing that is done in expand_inherited_tables(). Specifically, we expand root->simple_rte_array and root->parse->rtable, rewrite the target table's RangeTblEntry as a partitioned table, and create RangeTblEntries and AppendRelInfos for all hypothetical partitions. Besides that, we set the hypothetical partition's name into rte->alias->aliasname at this time, so that EXPLAIN displays the hypothetical partition's name. And then, we rewrite RelOptInfo as needed. Specifically, we add partition information, which is normally set in set_relation_partition_info(), to hypothetical partitioned tables, and set rel->tuples and rel->pages for hypothetical partitions. 
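As a quick cross-check of the hypothetical plan in the usage example, this Python sketch (not part of HypoPG) routes the 299 generated rows through the declared bounds, using PostgreSQL's inclusive-lower/exclusive-upper range-partition semantics. The row counts come out exactly as in the EXPLAIN output above (rows=99 and rows=200).

```python
# Route ids 1..299 through the two hypothetical range partitions of t1.
partitions = {
    't1_1': (1, 100),    # FOR VALUES FROM (1) TO (100)
    't1_2': (100, 300),  # FOR VALUES FROM (100) TO (300)
}

def route(a):
    """Return the partition that would hold a row with partition key 'a'."""
    for name, (lo, hi) in partitions.items():
        if lo <= a < hi:          # lower bound inclusive, upper exclusive
            return name
    return 'DEFAULT'              # t1 has no default partition in the example

counts = {}
for a in range(1, 300):           # INSERT ... generate_series(1,299)
    p = route(a)
    counts[p] = counts.get(p, 0) + 1
print(counts)   # -> {'t1_1': 99, 't1_2': 200}
```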
However, PostgreSQL isn't designed to have hypothetical tables, so we face some blockers in the implementation, as follows. We'd like to discuss these topics.

Topics of discussion
---
- Expanding a partition's RTE

We have to do all the processing which is done in expand_inherited_tables() for hypothetical partitions. But, since there are no hooks around there, we use get_relation_info_hook() as mentioned above. In this case, we cannot simulate UPDATE queries correctly, because inheritance_planner(), which handles UPDATE queries, is called before get_relation_info_hook(). Therefore, we'd like to see if we could add a hook at ex