Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-25 Thread Andy Fan
I have started the new thread [1] to continue talking about this. Mr. cfbot is happy now. [1] https://www.postgresql.org/message-id/flat/CAKU4AWrwZMAL%3DuaFUDMf4WGOVkEL3ONbatqju9nSXTUucpp_pw%40mail.gmail.com Thanks >

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-17 Thread Andy Fan
Hi David: On Wed, Mar 18, 2020 at 12:13 PM David Rowley wrote: > On Wed, 18 Mar 2020 at 15:57, Andy Fan wrote: > > I'm now writing the code for partition index stuff, which > > is a bit of boring, since every partition may have different unique > index. > > Why is that case so different? > > Fo

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-17 Thread David Rowley
On Wed, 18 Mar 2020 at 15:57, Andy Fan wrote: > I'm now writing the code for partition index stuff, which > is a bit of boring, since every partition may have different unique index. Why is that case so different? For a partitioned table to have a valid unique index, a unique index must exist on

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-17 Thread Andy Fan
Hi David: Thanks for your time. On Wed, Mar 18, 2020 at 9:56 AM David Rowley wrote: > On Mon, 16 Mar 2020 at 06:01, Andy Fan wrote: > > > > Hi All: > > > > I have re-implemented the patch based on David's suggestion/code, Looks > it > > works well. The updated patch mainly includes: > > > >

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-17 Thread David Rowley
On Mon, 16 Mar 2020 at 06:01, Andy Fan wrote: > > Hi All: > > I have re-implemented the patch based on David's suggestion/code, Looks it > works well. The updated patch mainly includes: > > 1. Maintain the not_null_colno in RelOptInfo, which includes the not null from > catalog and the not

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-15 Thread Andy Fan
Hi All: I have re-implemented the patch based on David's suggestion/code, Looks it works well. The updated patch mainly includes: 1. Maintain the not_null_colno in RelOptInfo, which includes the not null from catalog and the not null from vars. 2. Add the restictinfo check at populate_base

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-12 Thread Andy Fan
On Fri, Mar 13, 2020 at 11:46 AM David Rowley wrote: > On Fri, 13 Mar 2020 at 14:47, Andy Fan wrote: > > 1. for pupulate_baserel_uniquekeys, we need handle the "pk = Const" > as well. > > (relation_has_unqiue_for has a similar logic) currently the following > distinct path is still > > there

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-12 Thread David Rowley
On Fri, 13 Mar 2020 at 14:47, Andy Fan wrote: > 1. for pupulate_baserel_uniquekeys, we need handle the "pk = Const" as > well. > (relation_has_unqiue_for has a similar logic) currently the following > distinct path is still > there. Yeah, I left a comment in propagate_unique_keys_to_joinrel

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-12 Thread Andy Fan
Hi David: On Thu, Mar 12, 2020 at 3:51 PM David Rowley wrote: > On Wed, 11 Mar 2020 at 17:23, Andy Fan wrote: > > Now I am convinced that we should maintain UniquePath on RelOptInfo, > > I would see how to work with "Index Skip Scan" patch. > > I've attached a very early proof of concept patch

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-12 Thread David Rowley
On Wed, 11 Mar 2020 at 17:23, Andy Fan wrote: > Now I am convinced that we should maintain UniquePath on RelOptInfo, > I would see how to work with "Index Skip Scan" patch. I've attached a very early proof of concept patch for unique keys. The NULL detection stuff is not yet hooked up, so it'll c

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-11 Thread Ashutosh Bapat
On Wed, Mar 11, 2020 at 4:19 AM David Rowley wrote: > > On Wed, 11 Mar 2020 at 02:50, Ashutosh Bapat > wrote: > > > > On Tue, Mar 10, 2020 at 1:49 PM Andy Fan wrote: > > > In my current implementation, it calculates the uniqueness for each > > > BaseRel only, but in your way, looks we need to c

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-11 Thread Ashutosh Bapat
On Tue, Mar 10, 2020 at 9:12 PM Andy Fan wrote: > > > Hi Tom & David & Bapat: > > Thanks for your review so far. I want to summarize the current issues to help > our following discussion. > > 1. Shall we bypass the AggNode as well with the same logic. > > I think yes, since the rules to bypass a

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-10 Thread Andy Fan
On Wed, Mar 11, 2020 at 6:49 AM David Rowley wrote: > On Wed, 11 Mar 2020 at 02:50, Ashutosh Bapat > wrote: > > > > On Tue, Mar 10, 2020 at 1:49 PM Andy Fan > wrote: > > > In my current implementation, it calculates the uniqueness for each > > > BaseRel only, but in your way, looks we need to

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-10 Thread David Rowley
On Tue, 10 Mar 2020 at 21:19, Andy Fan wrote: > >> SELECT DISTINCT max(non_unique) FROM t1; to skip doing the DISTINCT part. > > > Actually I want to keep the distinct for this case now. One reason is there > are only 1 > row returned, so the distinct erasing can't help much. The more importan

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-10 Thread David Rowley
On Wed, 11 Mar 2020 at 02:50, Ashutosh Bapat wrote: > > On Tue, Mar 10, 2020 at 1:49 PM Andy Fan wrote: > > In my current implementation, it calculates the uniqueness for each > > BaseRel only, but in your way, looks we need to calculate the > > UniquePathKey for both BaseRel and JoinRel. This

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-10 Thread Andy Fan
Hi Tom & David & Bapat: Thanks for your review so far. I want to summarize the current issues to help our following discussion. 1. Shall we bypass the AggNode as well with the same logic. I think yes, since the rules to bypass a AggNode and UniqueNode is exactly same. The difficulty of bypassin

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-10 Thread Ashutosh Bapat
Hi Andy, On Tue, Mar 10, 2020 at 1:49 PM Andy Fan wrote: > > Hi David: > >> >> 3. Perhaps in add_paths_to_joinrel(), or maybe when creating the join >> rel itself (I've not looked for the best location in detail), >> determine if the join can cause rows to be duplicated. If it can't, >> then add

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-10 Thread Ashutosh Bapat
On Tue, Mar 10, 2020 at 3:51 AM David Rowley wrote: > On Sat, 7 Mar 2020 at 00:47, Andy Fan wrote: > > Upload the newest patch so that the cfbot can pass. The last patch > failed > > because some explain without the (cost off). > > I've only really glanced at this patch, but I think we need to

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-10 Thread Andy Fan
Hi David: > 3. Perhaps in add_paths_to_joinrel(), or maybe when creating the join > rel itself (I've not looked for the best location in detail), > determine if the join can cause rows to be duplicated. If it can't, > then add the UniqueKeys from that rel. I have some concerns about this method

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-09 Thread David Rowley
On Sat, 7 Mar 2020 at 00:47, Andy Fan wrote: > Upload the newest patch so that the cfbot can pass. The last patch failed > because some explain without the (cost off). I've only really glanced at this patch, but I think we need to do this in a completely different way. I've been mentioning Uniq

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-06 Thread Andy Fan
Upload the newest patch so that the cfbot can pass. The last patch failed because some explain without the (cost off). I'm still on the way to figure out how to handle aggregation calls without aggregation path. Probably we can get there by hacking some ExprEvalPushStep for Aggref node. But sin

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-04 Thread Andy Fan
> > >> * There are some changes in existing regression cases that aren't >> visibly related to the stated purpose of the patch, eg it now >> notices that "select distinct max(unique2) from tenk1" doesn't >> require an explicit DISTINCT step. That's not wrong, but I wonder >> if maybe you should su

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-02 Thread Andy Fan
On Tue, Mar 3, 2020 at 1:24 AM Andy Fan wrote: > Thank you Tom for the review! > > On Mon, Mar 2, 2020 at 4:46 AM Tom Lane wrote: > >> Andy Fan writes: >> > Please see if you have any comments. Thanks >> >> The cfbot isn't at all happy with this. Its linux build is complaining >> about a pos

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-02 Thread Andy Fan
Thank you Tom for the review! On Mon, Mar 2, 2020 at 4:46 AM Tom Lane wrote: > Andy Fan writes: > > Please see if you have any comments. Thanks > > The cfbot isn't at all happy with this. Its linux build is complaining > about a possibly-uninitialized variable, and then giving up: > > https:

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-01 Thread Tom Lane
Andy Fan writes: > Please see if you have any comments. Thanks The cfbot isn't at all happy with this. Its linux build is complaining about a possibly-uninitialized variable, and then giving up: https://travis-ci.org/postgresql-cfbot/postgresql/builds/656722993 The Windows build isn't using

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-24 Thread Andy Fan
;>> >>> >>> On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat < >>> ashutosh.bapat....@gmail.com> wrote: >>> >>>> >>>> >>>>> >>>>> [PATCH] Erase the distinctClause if the result is unique by >>>>>

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-24 Thread Andy Fan
gt; >>> >>> >>>> >>>> [PATCH] Erase the distinctClause if the result is unique by >>>> definition >>>> >>> >>> I forgot to mention this in the last round of comments. Your patch was >>> actually removing

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-13 Thread Andy Fan
gt; ashutosh.bapat@gmail.com> wrote: > > > > > >>> > > >>> [PATCH] Erase the distinctClause if the result is unique by > > >>> definition > > >> > > >> I forgot to mention this in the last round of comments. Your patch was > &g

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-13 Thread Julien Rouhaud
On Tue, Feb 11, 2020 at 10:06:17PM +0530, Ashutosh Bapat wrote: > On Tue, Feb 11, 2020 at 8:27 AM Andy Fan wrote: > > > On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat < > > ashutosh.bapat@gmail.com> wrote: > > > >>> > >>> [PATC

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Ashutosh Bapat
On Tue, Feb 11, 2020 at 8:27 AM Andy Fan wrote: > > > On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat < > ashutosh.bapat@gmail.com> wrote: > >> >> >>> >>> [PATCH] Erase the distinctClause if the result is unique by >>> defini

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Ashutosh Bapat
On Mon, Feb 10, 2020 at 10:57 PM Tom Lane wrote: > Ashutosh Bapat writes: > >> On Sat, Feb 8, 2020 at 12:53 PM Andy Fan > wrote: > >> Do you mean adding some information into PlannerInfo, and when we > create > >> a node for Unique/HashAggregate/Group, we can just create a dummy node? > > > N

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Julien Rouhaud
On Tue, Feb 11, 2020 at 08:14:14PM +0800, Andy Fan wrote: > On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud wrote: > > > > > > > and if we prepare sql outside a transaction, and execute it in the > > > transaction, the other session can't drop the constraint until the > > > transaction is ended. >

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Andy Fan
On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud wrote: > > > > and if we prepare sql outside a transaction, and execute it in the > > transaction, the other session can't drop the constraint until the > > transaction is ended. > > And what if you create a view on top of a query containing a distin

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-11 Thread Andy Fan
On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud wrote: > On Tue, Feb 11, 2020 at 10:57:26AM +0800, Andy Fan wrote: > > On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat < > > ashutosh.bapat@gmail.com> wrote: > > > > > I forgot to mention this in the last round of comments. Your patch was > > > a

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Julien Rouhaud
On Tue, Feb 11, 2020 at 10:57:26AM +0800, Andy Fan wrote: > On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat < > ashutosh.bapat@gmail.com> wrote: > > > I forgot to mention this in the last round of comments. Your patch was > > actually removing distictClause from the Query structure. Please avoi

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Andy Fan
On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat < ashutosh.bapat@gmail.com> wrote: > > >> >> [PATCH] Erase the distinctClause if the result is unique by >> definition >> > > I forgot to mention this in the last round of comments. Your patch was

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Tom Lane
Ashutosh Bapat writes: >> On Sat, Feb 8, 2020 at 12:53 PM Andy Fan wrote: >> Do you mean adding some information into PlannerInfo, and when we create >> a node for Unique/HashAggregate/Group, we can just create a dummy node? > Not so much as PlannerInfo but something on lines of PathKey. See P

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-10 Thread Ashutosh Bapat
s example, it seems that the distinct operation can be dropped >> because (a, b) is a primary key. Is my understanding correct? >> > > Yes, you are correct. Actually I added then to commit message, > but it's true that I should have copied them in this email body as

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-07 Thread Andy Fan
o commit message, but it's true that I should have copied them in this email body as well. so copy it now. [PATCH] Erase the distinctClause if the result is unique by definition For a single relation, we can tell it by any one of the following is true: 1. The pk is in the target list. 2. T

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-07 Thread Ashutosh Bapat
Hi Andy, What might help is to add more description to your email message like giving examples to explain your idea. Anyway, I looked at the testcases you added for examples. +create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b)); +set enable_mer

Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-02-05 Thread Andy Fan
update the patch with considering the semi/anti join. Can anyone help to review this patch? Thanks On Fri, Jan 31, 2020 at 8:39 PM Andy Fan wrote: > Hi: > > I wrote a patch to erase the distinctClause if the result is unique by > definition, I find this because a user switch this code from o

[PATCH] Erase the distinctClause if the result is unique by definition

2020-01-31 Thread Andy Fan
Hi: I wrote a patch to erase the distinctClause if the result is unique by definition, I find this because a user switch this code from oracle to PG and find the performance is bad due to this, so I adapt pg for this as well. This patch doesn't work for a well-written SQL, but some drawback of