Re: Pathify RHS unique-ification for semijoin planning

2025-09-03 Thread Andrei Lepikhov
On 3/9/2025 11:12, Richard Guo wrote: On Tue, Sep 2, 2025 at 7:56 PM Andrei Lepikhov wrote: No questions, it is good enough optimisation. I'm worried only about implementation: It creates one more RelOptInfo that may look like a baserel, but we can't find it by find_base_rel or even find_join_r

Re: Pathify RHS unique-ification for semijoin planning

2025-09-03 Thread Richard Guo
On Tue, Sep 2, 2025 at 7:56 PM Andrei Lepikhov wrote: > No questions, it is good enough optimisation. I'm worried only about > implementation: It creates one more RelOptInfo that may look like a > baserel, but we can't find it by find_base_rel or even find_join_rel. It > seems a little inconsisten

Re: Pathify RHS unique-ification for semijoin planning

2025-09-02 Thread Andrei Lepikhov
On 2/9/2025 12:10, Richard Guo wrote: So this specific case runs about 3.7 times faster, which is really nice.No questions, it is good enough optimisation. I'm worried only about implementation: It creates one more RelOptInfo that may look like a baserel, but we can't find it by find_base_rel o

Re: Pathify RHS unique-ification for semijoin planning

2025-09-02 Thread Richard Guo
On Wed, Jul 23, 2025 at 5:11 PM Álvaro Herrera wrote: > As a very trivial test on this patch, I ran the query in your opening > email, both with and without the patch, scaling up the size of the table > a little bit. > This is a really nice improvement. I think we could find queries that > are a

Re: Pathify RHS unique-ification for semijoin planning

2025-08-18 Thread Richard Guo
On Mon, Aug 18, 2025 at 3:07 PM Richard Guo wrote: > Here's the updated version of the patch, which renames the macro > IS_UNIQUEIFIED_REL to RELATION_WAS_MADE_UNIQUE, and includes some > comment updates as well. I plan to push it soon, barring any > objections. Pushed. > This patch removes the

Re: Pathify RHS unique-ification for semijoin planning

2025-08-12 Thread Richard Guo
On Wed, Aug 13, 2025 at 11:27 AM Tom Lane wrote: > Richard Guo writes: > > In this patch, the only instance that doesn't follow the "unique-ify" > > form is the macro IS_UNIQUEIFIED_REL, as dashes are not allowed in C > > identifiers. Maybe a better alternative is IS_RELATION_UNIQUE? Any > > su

Re: Pathify RHS unique-ification for semijoin planning

2025-08-12 Thread Tom Lane
Richard Guo writes: > Given this, I'd prefer to stick with "unique-ify", for consistency > with the majority usage in the codebase. +1. (Not but what I might've been responsible for many of the existing usages, so my opinion is perhaps counting twice here.) > In this patch, the only instance th

Re: Pathify RHS unique-ification for semijoin planning

2025-08-12 Thread Richard Guo
On Wed, Aug 13, 2025 at 1:38 AM Tom Lane wrote: > =?utf-8?Q?=C3=81lvaro?= Herrera writes: > > No review, but apparently "uniquify" is more widely accepted than > > "uniqueify". > Personally I'd write "unique-ify", seeing that neither of the forms > without the dash are considered good English.

Re: Pathify RHS unique-ification for semijoin planning

2025-08-12 Thread Tom Lane
=?utf-8?Q?=C3=81lvaro?= Herrera writes: > No review, but apparently "uniquify" is more widely accepted than > "uniqueify". Personally I'd write "unique-ify", seeing that neither of the forms without the dash are considered good English. Of course, if you need to make identifiers out of this, tha

Re: Pathify RHS unique-ification for semijoin planning

2025-08-12 Thread Álvaro Herrera
On 2025-Aug-12, Richard Guo wrote: > Does anyone plan to review this patch further? I intend to push it in > two weeks unless there are any objections or additional comments. No review, but apparently "uniquify" is more widely accepted than "uniqueify". No dictionary lists either words AFAICS,

Re: Pathify RHS unique-ification for semijoin planning

2025-08-11 Thread Richard Guo
On Mon, Aug 4, 2025 at 11:08 AM Richard Guo wrote: > The v5 patch does not apply anymore, and here is a new rebase. There > are two main changes in v6: > > * I choose to use the check I proposed earlier to determine whether a > relation has been unique-ified in costsize.c. > > * Now that the only

Re: Pathify RHS unique-ification for semijoin planning

2025-08-07 Thread wenhui qiu
Hi Richard Thanks for your feedback. I agree this approach is better for keeping the code style consistent. Thanks On Fri, Aug 8, 2025 at 9:39 AM Richard Guo wrote: > On Thu, Aug 7, 2025 at 6:04 PM wenhui qiu wrote: > > In light of this commit ( > https://github.com/postgres/postgres/comm

Re: Pathify RHS unique-ification for semijoin planning

2025-08-07 Thread Richard Guo
On Thu, Aug 7, 2025 at 6:04 PM wenhui qiu wrote: > In light of this commit > (https://github.com/postgres/postgres/commit/e035863c9a04beeecc254c3bfe48dab58e389e10), > I also recommend changing the macro to a static inline function. Macros are > harder to debug and lack type safety. I'm incline

Re: Pathify RHS unique-ification for semijoin planning

2025-08-07 Thread wenhui qiu
HI Richard Guo +/* + * Is given relation unique-ified? + * + * When the nominal jointype is JOIN_INNER, sjinfo->jointype is JOIN_SEMI, and + * the given rel is exactly the RHS of the semijoin, it indicates that the rel + * has been unique-ified. + */ +#define IS_UNIQUEIFIED_REL(rel, sjinfo, nominal

Re: Pathify RHS unique-ification for semijoin planning

2025-08-03 Thread Richard Guo
On Fri, Aug 1, 2025 at 11:58 PM Alexandra Wang wrote: > While reviewing the code, the following diff concerns me: > if (joinrel->consider_parallel && > - save_jointype != JOIN_UNIQUE_OUTER && > - save_jointype != JOIN_FULL && > - save_jointype != JOIN_RIGHT && > - save_jointype != JOIN_RIGHT_AN

Re: Pathify RHS unique-ification for semijoin planning

2025-08-01 Thread Alexandra Wang
Hi Richard, On Wed, Jul 30, 2025 at 9:08 PM Richard Guo wrote: > On Thu, Jul 31, 2025 at 10:33 AM Alexandra Wang > wrote: > > Thanks for the patch! I applied your patch and played around with it. > > Thanks for trying out this patch. > > > I have a question about the following test case you add

Re: Pathify RHS unique-ification for semijoin planning

2025-07-31 Thread Richard Guo
On Thu, Jul 31, 2025 at 9:49 PM wenhui qiu wrote: > > This seems to be another case where the planner chooses a suboptimal > > plan due to inaccurate cost estimates. > Agree ,I increased some rows , set enable_hashagg to on and off ,There's no > difference in execution time. The execution plan

Re: Pathify RHS unique-ification for semijoin planning

2025-07-31 Thread wenhui qiu
HI > This is a very interesting observation. In fact, with the original v5 > patch, you can produce both plans by setting enable_hashagg on and > off. > set enable_hashagg to on; > Incremental Sort (cost=91.95..277.59 rows=2500 width=16) >(actual time=31.960..147.040 rows

Re: Pathify RHS unique-ification for semijoin planning

2025-07-31 Thread Richard Guo
On Thu, Jul 31, 2025 at 1:08 PM Richard Guo wrote: > On Thu, Jul 31, 2025 at 10:33 AM Alexandra Wang > wrote: > > While looking at the code, I also had a question about the following > > changes in costsize.c: > > > > --- a/src/backend/optimizer/path/costsize.c > > +++ b/src/backend/optimizer/pat

Re: Pathify RHS unique-ification for semijoin planning

2025-07-30 Thread Richard Guo
On Thu, Jul 31, 2025 at 10:33 AM Alexandra Wang wrote: > Thanks for the patch! I applied your patch and played around with it. Thanks for trying out this patch. > I have a question about the following test case you added in > subselect.sql: > I was under the impression that we wanted Unique on

Re: Pathify RHS unique-ification for semijoin planning

2025-07-30 Thread Alexandra Wang
Hi Richard, Thanks for the patch! I applied your patch and played around with it. I have a question about the following test case you added in subselect.sql: +-- Ensure that we can unique-ify expressions more complex than plain Vars +explain (verbose, costs off) +select * from semijoin_unique_tb

Re: Pathify RHS unique-ification for semijoin planning

2025-07-23 Thread Richard Guo
On Wed, Jul 23, 2025 at 5:11 PM Álvaro Herrera wrote: > As a very trivial test on this patch, I ran the query in your opening > email, both with and without the patch, scaling up the size of the table > a little bit. > This is a really nice improvement. I think we could find queries that > are a

Re: Pathify RHS unique-ification for semijoin planning

2025-07-23 Thread Álvaro Herrera
Hello, As a very trivial test on this patch, I ran the query in your opening email, both with and without the patch, scaling up the size of the table a little bit. So I did this drop table if exists t; create table t(a int, b int); insert into t select i % 10, i from

Re: Pathify RHS unique-ification for semijoin planning

2025-07-03 Thread Richard Guo
On Thu, Jul 3, 2025 at 7:06 PM Richard Guo wrote: > This patch does not apply again, so here is a new rebase. > > This version also fixes an issue related to parameterized paths: if > the RHS has LATERAL references to the LHS, unique-ification becomes > meaningless because the RHS depends on the L

Re: Pathify RHS unique-ification for semijoin planning

2025-05-27 Thread Richard Guo
On Thu, May 22, 2025 at 4:05 PM Richard Guo wrote: > Therefore, I'm thinking that maybe we could create a new RelOptInfo > for the RHS rel to represent its unique-ified version, and then > generate all worthwhile paths for it, similar to how it's done in > create_distinct_paths(). Since this is l

Re: Pathify RHS unique-ification for semijoin planning

2025-05-22 Thread wenhui qiu
On Thu, 22 May 2025 at 17:28, Andy Fan wrote: > Richard Guo writes: > > Hi, > > > However, in the case of sort-based implementation, > > this function pays no attention to the subpath's pathkeys or the > > pathkeys of the resulting output. > > Good finding! > > > > > In addition to this specific

Re: Pathify RHS unique-ification for semijoin planning

2025-05-22 Thread Andy Fan
Richard Guo writes: Hi, > However, in the case of sort-based implementation, > this function pays no attention to the subpath's pathkeys or the > pathkeys of the resulting output. Good finding! > > In addition to this specific issue, it seems to me that there are > other potential issues in cr