Re: sqlsmith crash incremental sort

2020-04-23 Thread Richard Guo
On Thu, Apr 23, 2020 at 6:59 AM Tomas Vondra wrote: > I've pushed fix with the DEFAULT_NUM_DISTINCT. The input comes from a > set operation (which is where we call generate_append_tlist), so it's > probably fairly unique, so maybe we should use input_tuples. But it's > not guaranteed, so DEFAULT_

Pulling up sublink may break join-removal logic

2020-04-28 Thread Richard Guo
Hi hackers, I happened to notice $subject and not sure if it's an issue or not. When we're trying to remove a LEFT JOIN, one of the requirements is the inner side needs to be a single baserel. If there is a join qual that is a sublink and can be converted to a semi join with the inner side rel, th

Re: Pulling up sublink may break join-removal logic

2020-04-28 Thread Richard Guo
On Wed, Apr 29, 2020 at 8:23 AM David Rowley wrote: > On Tue, 28 Apr 2020 at 19:04, Richard Guo wrote: > > I happened to notice $subject and not sure if it's an issue or not. When > > we're trying to remove a LEFT JOIN, one of the requirements is the inner > > si

Remove unnecessary relabel stripping

2020-04-29 Thread Richard Guo
Hi hackers, Per discussion in [1], we don't need to strip relabel for the expr explicitly before calling pull_varnos() to retrieve all mentioned relids. pull_varnos() would recurse into T_RelabelType nodes. Add a patch to remove that and simplify the code a bit. [1] https://www.postgresql.org/me

Re: Remove unnecessary relabel stripping

2020-04-29 Thread Richard Guo
On Thu, Apr 30, 2020 at 8:11 AM Tomas Vondra wrote: > On Wed, Apr 29, 2020 at 03:51:40PM +0800, Richard Guo wrote: > >Hi hackers, > > > >Per discussion in [1], we don't need to strip relabel for the expr > >explicitly before calling pull_varnos() to retrieve all

Re: Optimizer docs typos

2020-05-18 Thread Richard Guo
In this same README doc, another suspicious typo to me, which happens in section "Optimizer Functions", is in the prefix to query_planner(), we should have three dashes, rather than two, since query_planner() is called within grouping_planner(). diff --git a/src/backend/optimizer/README b/src/back

About reducing EXISTS sublink

2020-05-22 Thread Richard Guo
Hi hackers, For EXISTS SubLink, in some cases the subquery can be reduced to constant TRUE or FALSE, based on the knowledge that it's being used in EXISTS(). One such case is when the subquery has aggregates without GROUP BY or HAVING, and we know its result is exactly one row, unless that row is

Re: Optimizer docs typos

2020-05-22 Thread Richard Guo
On Fri, May 22, 2020 at 2:53 PM Etsuro Fujita wrote: > > Done. > Thanks! Thanks Richard

Re: About reducing EXISTS sublink

2020-05-25 Thread Richard Guo
On Fri, May 22, 2020 at 10:59 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > On Friday, May 22, 2020, Richard Guo wrote: > >> Hi hackers, >> >> For EXISTS SubLink, in some cases the subquery can be reduced to >> constant TRUE or FALSE, based on t

Re: Fast path for empty relids in check_outerjoin_delay()

2019-02-12 Thread Richard Guo
Yes sure. Thanks Tom. On Fri, Feb 1, 2019 at 1:04 AM Tom Lane wrote: > Since we've reached the end of the January commitfest, and it's pretty > clear that this patch isn't going to get committed in anything like > its current form, I'm going to close the CF entry as returned with > feedback. If

Re: NOT IN subquery optimization

2019-02-26 Thread Richard Guo
Greenplum Database does this optimization. The idea is to use a new join type, let's call it JOIN_LASJ_NOTIN, and its semantic regarding NULL is defined as below: 1. If there is a NULL in the outer side, and the inner side is empty, the NULL should be part of the outputs. 2. If there is a NULL

Re: NOT IN subquery optimization

2019-02-26 Thread Richard Guo
On Wed, Feb 27, 2019 at 4:52 AM David Rowley wrote: > On Wed, 27 Feb 2019 at 03:07, Jim Finnerty wrote: > > If you're proposing to do that for this thread then I can take my > planner only patch somewhere else. I only posted my patch as I pretty > much already had what I thought you were origin

Fast path for empty relids in check_outerjoin_delay()

2018-12-11 Thread Richard Guo
Hi all, Function check_outerjoin_delay() is used to detect whether a qual referencing the given relids must be delayed in application due to the presence of a lower outer join. If the given relids are empty, we should be able to return from this function via the same fast path as for the case tha

Re: Fast path for empty relids in check_outerjoin_delay()

2019-01-08 Thread Richard Guo
On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 12/12/2018 08:32, Richard Guo wrote: > > This small revise is not expected to bring performance improvements, but > > can improve the readability of the code that for empty re

Re: Fast path for empty relids in check_outerjoin_delay()

2019-01-09 Thread Richard Guo
On Tue, Jan 8, 2019 at 11:29 PM Tom Lane wrote: > Richard Guo writes: > > On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut < > > peter.eisentr...@2ndquadrant.com> wrote: > >> I think code readability and maintainability would be improved by having > >>

Re: Parallel grouping sets

2020-02-03 Thread Richard Guo
Hi Jesse, Thanks for reviewing these two patches. On Sat, Jan 25, 2020 at 6:52 AM Jesse Zhang wrote: > > I glanced over both patches. Just the opposite, I have a hunch that v3 > is always better than v5. Here's my 6-minute understanding of both. > > v5 (the one with a simple partial aggregate)

Re: Parallel grouping sets

2020-02-03 Thread Richard Guo
Hi Amit, Thanks for reviewing these two patches. On Sat, Jan 25, 2020 at 6:31 PM Amit Kapila wrote: > > This is what I also understood after reading this thread. So, my > question is why not just review v3 and commit something on those lines > even though it would take a bit more time. It is

Re: [HACKERS] WIP: Aggregation push-down

2020-02-06 Thread Richard Guo
Hi, I've been looking at the 'make_join_rel()' part of the patch, and I'm aware that, if we are joining A and B, a 'grouped join rel (AB)' would be created besides the 'plain join rel (AB)', and paths are added by 1) applying partial aggregate to the paths of the 'plain join rel (AB)', or 2) joini

Re: Parallel grouping sets

2020-02-24 Thread Richard Guo
To summarize the current state of parallel grouping sets, we now have two available implementations for it. 1) Each worker performs an aggregation step, producing a partial result for each group of which that process is aware. Then the partial results are gathered to the leader, which then perform

Trying to pull up EXPR SubLinks

2020-02-27 Thread Richard Guo
Hi All, Currently we will not consider EXPR_SUBLINK when pulling up sublinks and this would cause performance issues for some queries with the form of: 'a > (SELECT agg(b) from ...)' as described in [1]. So here is a patch as an attempt to pull up EXPR SubLinks. The idea, which is based on Greenp

Re: Trying to pull up EXPR SubLinks

2020-02-28 Thread Richard Guo
On Fri, Feb 28, 2020 at 3:02 PM Andy Fan wrote: > > > On Fri, Feb 28, 2020 at 2:35 PM Richard Guo > wrote: > >> Hi All, >> >> Currently we will not consider EXPR_SUBLINK when pulling up sublinks and >> this would cause performance issues for some queries

Re: Trying to pull up EXPR SubLinks

2020-03-02 Thread Richard Guo
On Fri, Feb 28, 2020 at 11:35 PM Tom Lane wrote: > Richard Guo writes: > > On Fri, Feb 28, 2020 at 3:02 PM Andy Fan > wrote: > >> Glad to see this. I think the hard part is this transform is not > *always* > >> good. for example foo.a only has 1 rows, but b

Parallel grouping sets

2019-06-11 Thread Richard Guo
Hi all, Paul and I have been hacking recently to implement parallel grouping sets, and here we have two implementations. Implementation 1 Attached is the patch and also there is a github branch [1] for this work. Parallel aggregation has already been supported in PostgreSQL and

Re: Parallel grouping sets

2019-06-13 Thread Richard Guo
On Thu, Jun 13, 2019 at 12:29 PM David Rowley wrote: > On Wed, 12 Jun 2019 at 14:59, Richard Guo wrote: > > Implementation 1 > > > Parallel aggregation has already been supported in PostgreSQL and it is > > implemented by aggregating in two stages. First, each worker per

Fix up grouping sets reorder

2019-06-17 Thread Richard Guo
Hi all, During the reorder of grouping sets into correct prefix order, if only one aggregation pass is needed, we follow the order of the ORDER BY clause to the extent possible, to minimize the chance that we add unnecessary sorts. This is implemented in preprocess_grouping_sets --> reorder_groupi

Re: [HACKERS] WIP: Aggregation push-down

2019-07-02 Thread Richard Guo
On Fri, Mar 1, 2019 at 12:01 AM Antonin Houska wrote: > Tom Lane wrote: > > > Antonin Houska writes: > > > Michael Paquier wrote: > > >> Latest patch set fails to apply, so moved to next CF, waiting on > > >> author. > > > > > Rebased. > > > > This is in need of rebasing again :-(. I went ahe

Re: [HACKERS] WIP: Aggregation push-down

2019-07-11 Thread Richard Guo
On Tue, Jul 9, 2019 at 9:47 PM Antonin Houska wrote: > Richard Guo wrote: > > > Another rebase is needed for the patches. > > Done. > I didn't fully follow the whole thread and mainly looked into the latest patch set. So what are the considerations for abandoning the

Re: [HACKERS] WIP: Aggregation push-down

2019-07-15 Thread Richard Guo
On Fri, Jul 12, 2019 at 4:42 PM Antonin Houska wrote: > Richard Guo wrote: > > > I didn't fully follow the whole thread and mainly looked into the > > latest > > patch set. So what are the considerations for abandoning the > > aggmultifn > > concept?

Implement predicate propagation for non-equivalence clauses

2018-09-04 Thread Richard Guo
Hi, As we know, current planner will generate additional restriction clauses from equivalence clauses. This will generally lower the total cost because some of tuples may be filtered out before joins. In this patch, we are trying to do the similar deduction, from non-equivalence clauses, that is,

Re: Implement predicate propagation for non-equivalence clauses

2018-09-05 Thread Richard Guo
On Wed, Sep 5, 2018 at 2:55 PM, Heikki Linnakangas wrote: > On 05/09/18 09:34, Richard Guo wrote: > >> Hi, >> >> As we know, current planner will generate additional restriction clauses >> from >> equivalence clauses. This will generally lower the total cost

Re: Implement predicate propagation for non-equivalence clauses

2018-09-05 Thread Richard Guo
Hi, On Wed, Sep 5, 2018 at 2:56 PM, Tom Lane wrote: > Richard Guo writes: > > In this patch, we are trying to do the similar deduction, from > > non-equivalence > > clauses, that is, A=B AND f(A) implies A=B AND f(A) and f(B), under some > > restrictions on f. >

Re: Implement predicate propagation for non-equivalence clauses

2018-09-26 Thread Richard Guo
18 at 12:01 PM Richard Guo wrote: > Hi, > > On Wed, Sep 5, 2018 at 2:56 PM, Tom Lane wrote: > >> Richard Guo writes: >> > In this patch, we are trying to do the similar deduction, from >> > non-equivalence >> > clauses, that is, A=B AND f(A) implies A

Re: Inconsistent RestrictInfo serial numbers

2024-10-08 Thread Richard Guo
On Tue, Oct 8, 2024 at 9:02 PM Ashutosh Bapat wrote: > But I don't see any relation specific information being > considered while deciding whether the clause is constant false. Isn't rel->notnullattnums relation specific information? We need this information to decide if a Var cannot be NULL. >

Re: Inconsistent RestrictInfo serial numbers

2024-10-08 Thread Richard Guo
On Wed, Oct 9, 2024 at 11:15 AM Andrei Lepikhov wrote: > On 10/8/24 18:20, Richard Guo wrote: > > To fix, I think we can reset the root->last_rinfo_serial counter after > > generating the additional constant-FALSE RestrictInfo. Please see > > attached. > The approa

Inconsistent RestrictInfo serial numbers

2024-10-08 Thread Richard Guo
I ran into an "ERROR: variable not found in subplan target lists" error, which can be reproduced with the following query. create table t (a int primary key, b int); insert into t select i, i from generate_series(1, 10)i; analyze t; explain (costs off) select 1 from t t1 left join (t

Re: Eager aggregation, take 3

2024-10-05 Thread Richard Guo
On Sat, Oct 5, 2024 at 6:23 PM Richard Guo wrote: > > On Fri, Sep 27, 2024 at 11:53 AM Richard Guo wrote: > > Here is an updated version of this patch that fixes the rowcount > > estimate issue along this routine. (see set_joinpath_size.) > > I have worked on inventing

Re: Allow pushdown of HAVING clauses with grouping sets

2024-10-09 Thread Richard Guo
On Wed, Sep 11, 2024 at 11:43 AM Richard Guo wrote: > In some cases, we may want to transfer a HAVING clause into WHERE in > hopes of eliminating tuples before aggregation instead of after. > > Previously, we couldn't do this if there were any nonempty grouping > sets, beca

Re: Wrong results with grouping sets

2024-10-10 Thread Richard Guo
On Thu, Oct 10, 2024 at 4:06 PM Richard Guo wrote: > While we can fix this issue by propagating the hasGroupRTE mark from > the EXISTS subquery to the parent, a better fix might be to remove the > subquery's RTE_GROUP entry, since we have dropped the subquery's > groupCla

Re: Wrong results with grouping sets

2024-10-10 Thread Richard Guo
On Thu, Oct 10, 2024 at 2:39 PM David Rowley wrote: > create table a(a int); > explain select * from a where exists(Select 1 from a a2 where a.a = > a2.a group by a); > CREATE TABLE > server closed the connection unexpectedly > > TRAP: failed Assert("parse->hasGroupRTE"), File: > "../src/backend/

Re: Inconsistent RestrictInfo serial numbers

2024-10-09 Thread Richard Guo
On Wed, Oct 9, 2024 at 9:30 PM Ashutosh Bapat wrote: > On Wed, Oct 9, 2024 at 7:45 AM Richard Guo wrote: > > On Tue, Oct 8, 2024 at 9:02 PM Ashutosh Bapat > > wrote: > > > But I don't see any relation specific information being > > > considered while d

Re: Why don't we consider explicit Incremental Sort?

2024-10-09 Thread Richard Guo
On Sun, Sep 22, 2024 at 1:38 PM David Rowley wrote: > Just looking at the commit message: > > > The rationale is based on the assumption that incremental sort is > > always faster than full sort when there are presorted keys, a premise > > that has been applied in various parts of the code. This

Re: Eager aggregation, take 3

2024-10-29 Thread Richard Guo
On Fri, Oct 18, 2024 at 12:44 PM jian he wrote: > 1. root->parse->resultRelation > 0 > just be 100% sure we are only dealing with SELECT, or we can add > Assert at the end of setup_eager_aggregation. Can GROUP BY clauses be used in INSERT/UPDATE/DELETE/MERGE statements? If not, I think there

Re: Eager aggregation, take 3

2024-10-29 Thread Richard Guo
On Fri, Oct 18, 2024 at 10:22 PM jian he wrote: > So overall I doubt here BTEQUALIMAGE_PROC flag usage is correct. The BTEQUALIMAGE_PROC flag is used to prevent eager aggregation for types whose equality operators do not imply bitwise equality, such as NUMERIC. After a second thought, I think it

Re: Inconsistent RestrictInfo serial numbers

2024-10-27 Thread Richard Guo
On Fri, Oct 11, 2024 at 3:02 PM Ashutosh Bapat wrote: > 1. What strikes me as odd in your patch is that the last_rinfo_serial > is reset so far away from the new clause that will be created with the > next last_rinfo_serial OR the clause whose clones are being created. > It either indicates anothe

Re: Eager aggregation, take 3

2024-11-11 Thread Richard Guo
On Tue, Nov 12, 2024 at 1:30 AM Robert Haas wrote: > On Sun, Nov 10, 2024 at 7:52 PM Richard Guo wrote: > > Hmm, currently we only consider grouped aggregation for eager > > aggregation. For grouped aggregation, the window function's > > arguments, as well as the PART

Re: Wrong results with grouping sets

2024-10-24 Thread Richard Guo
On Thu, Oct 10, 2024 at 6:51 PM Richard Guo wrote: > On Thu, Oct 10, 2024 at 4:06 PM Richard Guo wrote: > > While we can fix this issue by propagating the hasGroupRTE mark from > > the EXISTS subquery to the parent, a better fix might be to remove the > > subquery's

Some dead code in get_param_path_clause_serials()

2024-11-13 Thread Richard Guo
The function get_param_path_clause_serials() is used to get the set of pushed-down clauses enforced within a parameterized Path. Since we don't currently support parameterized MergeAppend paths, and it doesn't look like that is going to change anytime soon (as explained in the comments for generat

Re: Reordering DISTINCT keys to match input path's pathkeys

2024-11-12 Thread Richard Guo
On Mon, Oct 28, 2024 at 6:15 PM Andrei Lepikhov wrote: > On 6/7/24 16:46, Richard Guo wrote: > > This patch does not apply any more, so here is a new rebase, with some > > tweaks to the comments. > This patch needs a minor rebase again. > After skimming the code, I want to s

Re: Eager aggregation, take 3

2024-10-31 Thread Richard Guo
On Thu, Oct 31, 2024 at 12:25 AM Robert Haas wrote: > Well, the key thing here is the reasoning, which you don't really > spell out either here or in the patch. The patch's justification for > the use of BTEQUALIMAGE_PROC Is that, if we use non-equal-image > operators, "we may lose some informatio

Re: Eager aggregation, take 3

2024-10-31 Thread Richard Guo
On Thu, Oct 31, 2024 at 9:16 PM jian he wrote: > > hi. > still trying to understand v13. found a bug. > > minimum test : > drop table if exists t1, t2; > CREATE TABLE t1 (a int, b int, c int); > CREATE TABLE t2 (a int, b int, c int); > SET enable_eager_aggregate TO on; > explain(costs off, setting

Re: Eager aggregation, take 3

2024-10-31 Thread Richard Guo
On Wed, Oct 30, 2024 at 5:06 AM Robert Haas wrote: > On Tue, Sep 24, 2024 at 11:20 PM Richard Guo wrote: > > The reason is that it is very tricky to set the size estimates for a > > grouped join relation. For a non-grouped join relation, we know that > > all its paths h

Re: Eager aggregation, take 3

2024-10-31 Thread Richard Guo
On Tue, Oct 29, 2024 at 9:59 PM Robert Haas wrote: > On Wed, Sep 25, 2024 at 3:03 AM Richard Guo wrote: > > On Wed, Sep 11, 2024 at 10:52 AM Tender Wang wrote: > > > 1. In make_one_rel(), we have the below codes: > > > /* > > > * Build grouped base rel

Re: Eager aggregation, take 3

2024-11-10 Thread Richard Guo
On Wed, Nov 6, 2024 at 11:43 PM Robert Haas wrote: > On Wed, Nov 6, 2024 at 3:22 AM Richard Guo wrote: > > Yeah, ordered aggregates could be a blocker. I think it might be best > > to prevent the use of eager aggregation if root->numOrderedAggs > 0 > > for now. >

Re: Why don't we consider explicit Incremental Sort?

2024-09-22 Thread Richard Guo
On Sun, Sep 22, 2024 at 1:38 PM David Rowley wrote: > Just looking at the commit message: > > > The rationale is based on the assumption that incremental sort is > > always faster than full sort when there are presorted keys, a premise > > that has been applied in various parts of the code. This

Re: Eager aggregation, take 3

2024-09-24 Thread Richard Guo
On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote: > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote: > > I haven't look all of them. I just pick few simple plan test(e.g. 19.sql, > > 45.sql). > > For example, 19.sql, eager agg pushdown doesn't get large gain, but a little > > performance reg

Re: Why don't we consider explicit Incremental Sort?

2024-09-23 Thread Richard Guo
On Mon, Sep 23, 2024 at 10:01 PM Tomas Vondra wrote: > You don't need any skew. Consider this perfectly uniform dataset: > > create table t (a int, b int); > insert into t select 10 * random(), 100 * random() > from generate_series(1,100) s(i); > create index on t (a); > vacuum analyze;

Re: Eager aggregation, take 3

2024-09-25 Thread Richard Guo
On Fri, Sep 13, 2024 at 3:48 PM Tender Wang wrote: > Since MERGE/SPLIT partition has been reverted, the tests *partition_merge* > and *partition_split* should be removed > from parallel_schedule. After doing the above, the 0002 patch can be applied. Yeah, that's what I need to do. Thanks Ric

Re: Why don't we consider explicit Incremental Sort?

2024-09-20 Thread Richard Guo
On Sun, Sep 15, 2024 at 6:01 AM Tomas Vondra wrote: > Hmmm ... I wasn't involved in that discussion and I haven't studied the > thread now, but this seems a bit weird to me. If the presorted keys have > low cardinality, can't the full sort be faster? > > Maybe it's not possible with the costing ch

Re: Eager aggregation, take 3

2024-09-24 Thread Richard Guo
On Thu, Sep 5, 2024 at 9:40 AM Tender Wang wrote: > 1. in setup_eager_aggregation(), before calling create_agg_clause_infos(), > it does > some checks if eager aggregation is available. Can we move those checks into > a function, > for example, can_eager_agg(), like can_partial_agg() does? We

Re: Eager aggregation, take 3

2024-09-25 Thread Richard Guo
On Wed, Sep 11, 2024 at 10:52 AM Tender Wang wrote: > 1. In make_one_rel(), we have the below codes: > /* > * Build grouped base relations for each base rel if possible. > */ > setup_base_grouped_rels(root); > > As far as I know, each base rel only has one grouped base relation, if > possible. >

Re: Inconsistent RestrictInfo serial numbers

2024-11-06 Thread Richard Guo
On Tue, Oct 8, 2024 at 8:20 PM Richard Guo wrote: > > I ran into an "ERROR: variable not found in subplan target lists" > error, which can be reproduced with the following query. > To fix, I think we can reset the root->last_rinfo_serial counter after > generating t

Re: Eager aggregation, take 3

2024-11-06 Thread Richard Guo
On Fri, Nov 1, 2024 at 9:42 PM Robert Haas wrote: > On Fri, Nov 1, 2024 at 2:21 AM Richard Guo wrote: > > ... an aggregated row from the partial > > aggregation matches the other side of the join if and only if each row > > in the partial group does, thereby ensuring that

Re: Inconsistent RestrictInfo serial numbers

2024-11-07 Thread Richard Guo
On Wed, Nov 6, 2024 at 11:06 PM Ashutosh Bapat wrote: > On Wed, Nov 6, 2024 at 4:10 PM Richard Guo wrote: > > I intend to push this patch shortly, barring any objections, so that > > it can catch up with the next minor release (Nov. 14). > > I didn't get time to respon

Re: Reordering DISTINCT keys to match input path's pathkeys

2024-11-13 Thread Richard Guo
On Wed, Nov 13, 2024 at 7:36 PM Andrei Lepikhov wrote: > On 11/13/24 13:49, Richard Guo wrote: > > Thanks for reviewing this patch. After some consideration, I think > > it's not too complex to also apply this optimization to DISTINCT ON. > > The parser already en

Re: An inefficient query caused by unnecessary PlaceHolderVar

2024-11-27 Thread Richard Guo
On Fri, Nov 22, 2024 at 5:08 AM Dmitry Dolgov <9erthali...@gmail.com> wrote: > The patch looks good to me, the implementation is concise and clear. I can't > imagine any visible overhead due to storing lowest_nullable_relids in this > context. The only nitpick I have is about this commentary: > >

Re: Some dead code in get_param_path_clause_serials()

2024-11-13 Thread Richard Guo
On Wed, Nov 13, 2024 at 9:59 PM Andrei Lepikhov wrote: > On 11/13/24 16:34, Richard Guo wrote: > > The function get_param_path_clause_serials() is used to get the set of > > pushed-down clauses enforced within a parameterized Path. Since we > > don't currently support

Re: Converting SetOp to read its two inputs separately

2024-11-14 Thread Richard Guo
On Thu, Nov 14, 2024 at 11:00 AM Tom Lane wrote: > Aside from that minor TODO, the main thing that's left undone in this > patch series is to persuade the thing to exploit presorted input > paths. Right now it fails to do so, as can be seen in some of the > regression test cases, eg > > regressio

Re: Converting SetOp to read its two inputs separately

2024-11-14 Thread Richard Guo
On Thu, Nov 14, 2024 at 6:28 PM Richard Guo wrote: > 1. We need to teach set_operation_ordered_results_useful() that sorted > input paths are also useful for INTERSECT/EXCEPT, so that we can have > setop_pathkeys set for the subqueries. BTW, I noticed a typo in the comment for

Re: POC, WIP: OR-clause support for indexes

2024-11-24 Thread Richard Guo
On Thu, Nov 21, 2024 at 3:34 PM Alexander Korotkov wrote: > I'm going to push this if no objections. Here is an Assert failure in match_orclause_to_indexcol. create table t (a int); create index on t (a); # explain select * from t where a <= 0 or a <= 1; server closed the connection unexpectedl

Re: Reordering DISTINCT keys to match input path's pathkeys

2024-11-25 Thread Richard Guo
On Fri, Nov 22, 2024 at 5:40 PM Andrei Lepikhov wrote: > I wonder if it would be possible to print only three rows instead of 10 > to prove the DISTINCT's correctness. There are ten distinct values in the 'distinct_tbl' test table, so I think it’d better to print all of them to verify correctness

Re: Reordering DISTINCT keys to match input path's pathkeys

2024-11-25 Thread Richard Guo
On Tue, Nov 26, 2024 at 10:12 AM Tom Lane wrote: > Richard Guo writes: > > I've applied some of the changes you suggested in your previous email > > and pushed the updated patch. Thank you for your review. > > The buildfarm's not too happy. It looks like this

Re: Some dead code in get_param_path_clause_serials()

2024-11-25 Thread Richard Guo
On Wed, Nov 13, 2024 at 9:59 PM Andrei Lepikhov wrote: > On 11/13/24 16:34, Richard Guo wrote: > > The function get_param_path_clause_serials() is used to get the set of > > pushed-down clauses enforced within a parameterized Path. Since we > > don't currently support

Re: Reordering DISTINCT keys to match input path's pathkeys

2024-11-25 Thread Richard Guo
On Tue, Nov 26, 2024 at 9:43 AM Richard Guo wrote: > I've applied some of the changes you suggested in your previous email > and pushed the updated patch. Thank you for your review. Ah, the buildfarm is complaining. I shouldn't create tables with the same name in two tes

Re: Reordering DISTINCT keys to match input path's pathkeys

2024-11-25 Thread Richard Guo
On Tue, Nov 26, 2024 at 10:22 AM Richard Guo wrote: > On Tue, Nov 26, 2024 at 10:12 AM Tom Lane wrote: > > Richard Guo writes: > > > I've applied some of the changes you suggested in your previous email > > > and pushed the updated patch. Thank you for your revi

Re: Potential Issue with Redundant Restriction Clauses in get_parameterized_baserel_size for PARTITIONED_REL

2024-11-26 Thread Richard Guo
On Tue, Nov 26, 2024 at 4:57 PM huyajun wrote: > The get_parameterized_baserel_size function does not differentiate for > PARTITIONED_REL and always appends the rel's own restriction clauses. > However, for PARTITIONED_REL, rel->tuples is computed in set_append_rel_size > which comes from the s

Re: An inefficient query caused by unnecessary PlaceHolderVar

2024-12-01 Thread Richard Guo
On Wed, Nov 27, 2024 at 5:45 PM Richard Guo wrote: > I ended up using 'under the same lowest nulling outer join' to > keep consistent with the wording used elsewhere. Please see the > updated patch attached. Commit e032e4c7d computes the nullingrel data for each leaf RTE,

Wrong results with right-semi-joins

2024-12-03 Thread Richard Guo
I ran into $subject and it can be reproduced with the query below. create temp table tbl_rs(a int, b int); insert into tbl_rs select i, i from generate_series(1,10)i; analyze tbl_rs; set enable_nestloop to off; set enable_hashagg to off; select * from tbl_rs t1 where (select a from tbl_rs t2

Re: Wrong results with right-semi-joins

2024-12-09 Thread Richard Guo
On Tue, Dec 3, 2024 at 5:56 PM Richard Guo wrote: > I've traced the root cause to ExecReScanHashJoin, where we neglect to > reset the inner-tuple match flags in the hash table for right-semi > joins when reusing the hash table. It was my oversight in commit > aa86129e1. Attache

Re: An inefficient query caused by unnecessary PlaceHolderVar

2024-12-09 Thread Richard Guo
On Tue, Dec 3, 2024 at 5:33 PM Andrei Lepikhov wrote: > A couple of words about your patch. These few lines of code caused a lot > of discoveries, but in my opinion, they look fine. But I didn't find > negative tests, where we need to wrap a Var with PHV like the following: Pushed after adding tw

Re: Remove unused rel parameter in lookup_var_attr_stats

2025-01-06 Thread Richard Guo
On Fri, Jan 3, 2025 at 11:09 PM Ilia Evdokimov wrote: > I've attached a small patch that remove unused parameter 'rel' from > the static function lookup_var_attr_stats() in extended_stats.c LGTM. It's a static function, and we can easily add this parameter back if it's ever needed. Thanks Richa

Re: Virtual generated columns

2025-01-08 Thread Richard Guo
On Fri, Nov 29, 2024 at 7:14 PM Peter Eisentraut wrote: > Here is a new patch version, with several updates. > - Added support for ALTER TABLE ... SET EXPRESSION. When using ALTER TABLE to set expression for virtual generated columns, we don't enforce a rewrite, which means we don't have the opp

ERROR: corrupt MVNDistinct entry

2024-12-24 Thread Richard Guo
I ran into this error in estimate_multivariate_ndistinct, and it can be reproduced with the query below. create table t (a int, b int); insert into t select 1, 1; create statistics s (ndistinct) on a, b from t; analyze; explain select 1 from t t1 left join (select a c1, coalesce(a) c2 from t

Re: ERROR: corrupt MVNDistinct entry

2024-12-26 Thread Richard Guo
On Wed, Dec 25, 2024 at 6:36 PM Richard Guo wrote: > In v16 and later, the nullingrels within the expression "t2.a + t2.b" > prevent it from being matched to the corresponding expression in > extended statistics, forcing us to use DEFAULT_UNK_SEL(0.005). Furthermore, ev

Re: should we have a fast-path planning for OLTP starjoins?

2025-02-05 Thread Richard Guo
On Wed, Feb 5, 2025 at 5:42 AM Tomas Vondra wrote: > Hmmm, yeah. But that's only for the INNER JOIN case. But I've seen many > of these star join queries with LEFT JOIN too, and then the FKs are not > needed. All you need is a PK / unique index on the other side. > > Perhaps requiring (INNER JOIN

Re: should we have a fast-path planning for OLTP starjoins?

2025-02-05 Thread Richard Guo
On Wed, Feb 5, 2025 at 5:55 AM Tom Lane wrote: > Right now, if we have four tables to join, we have a joinlist > (A B C D). (Really they're integer relids, but let's use names here.) > If we decide to force C to be joined last, it should be sufficient to > convert this to ((A B D) C). If C and D

Re: Unsafe access BufferDescriptors array in BufferGetLSNAtomic()

2025-02-04 Thread Richard Guo
On Thu, Nov 7, 2024 at 7:07 PM Tender Wang wrote: > While learning gist index insert codes, I find a little issue with > BufferGetLSNAtomic(). > At first, it wants to get bufHdr by accessing the buffer descriptor array, as > below: > > BufferDesc *bufHdr = GetBufferDescriptor(buffer - 1); > > Ho

Re: Fix outdated code comments in nodeAgg.c

2025-02-10 Thread Richard Guo
On Sun, Feb 9, 2025 at 11:15 PM Zhang Mingli wrote: > By reading the codes in nodeAgg.c, I found that some comments are outdated. > The comments referred to the function lookup_hash_entry(), which has been > removed in commit b563594. > I've adjusted the comments to refer to the correct function

Re: Adjust tuples estimate for appendrels

2025-02-10 Thread Richard Guo
Thanks for all the reviews. Attached is an updated patch that resolves the review comments. Thanks Richard v2-0001-Adjust-tuples-estimate-for-appendrels.patch Description: Binary data

Re: Virtual generated columns

2025-02-09 Thread Richard Guo
On Sun, Feb 9, 2025 at 7:02 PM Zhang Mingli wrote: > On Feb 9, 2025 at 16:00 +0800, Alexander Lakhin , wrote: > Please look at a planner error with a virtual generated column triggered > by the following script: > CREATE TABLE t(a int, b int GENERATED ALWAYS AS (a * 1)); > > SELECT SUM(CASE WHEN t

Re: should we have a fast-path planning for OLTP starjoins?

2025-02-09 Thread Richard Guo
On Mon, Feb 10, 2025 at 9:32 AM Tomas Vondra wrote: > E.g. imagine we have a join of 8 relations, with F (fact), dimensions D1 > and D2, and then some artibrary tables T1, T2, T3, T4, T5. And let's say > deconstruct_recurse() sees them in this particular order > > [F, T1, T2, D1, D2, T3, T4, T

Re: Virtual generated columns

2025-02-10 Thread Richard Guo
On Mon, Feb 10, 2025 at 1:16 PM Zhang Mingli wrote: > I believe virtual columns should behave like stored columns, except they > don't actually use storage. > Virtual columns are computed when the table is read, and they should adhere > to the same rules of join semantics. > I agree with Richard

Re: should we have a fast-path planning for OLTP starjoins?

2025-02-11 Thread Richard Guo
On Mon, Feb 10, 2025 at 5:35 PM Tomas Vondra wrote: > On 2/10/25 08:29, Richard Guo wrote: > > Hmm, I'm still a little concerned about whether the resulting joins > > are legal. Suppose we have a join pattern like the one below. > > > > F left join > > (

Re: Virtual generated columns

2025-02-11 Thread Richard Guo
On Tue, Feb 11, 2025 at 10:34 AM Richard Guo wrote: > Yeah, I also feel that the virtual generated columns should adhere to > outer join semantics, rather than being unconditionally replaced by > the generation expressions. But maybe I'm wrong. > > If that's the case, th

Re: Wrong results with right-semi-joins

2024-12-10 Thread Richard Guo
On Mon, Dec 9, 2024 at 11:01 PM Melanie Plageman wrote: > Thanks for finding and fixing this. Just for my own benefit, could you > explain more about the minimal repro? Specifically, if you just need a > subplan in the hash side of a right semi-join, why do you also need > the outer part of the qu

Avoid unnecessary wrapping for more complex expressions

2024-12-10 Thread Richard Guo
In commit f64ec81a8 we introduced an optimization that avoids wrapping for Vars and PHVs if they are lateral references to something outside the subquery, and the referenced rel is under the same lowest nulling outer join. It could be beneficial to get rid of such PHVs because they imply lateral d

Re: Wrong results with right-semi-joins

2024-12-11 Thread Richard Guo
On Wed, Dec 11, 2024 at 11:27 AM Richard Guo wrote: > I spent some time on this and came up with a simpler query to > reproduce the issue. > > explain (costs off) > select * from tbl_rs t1 join > lateral (select * from tbl_rs t2 where t2.a in > (select t1.a+t3.

Re: Eager aggregation, take 3

2024-12-16 Thread Richard Guo
On Wed, Dec 4, 2024 at 11:38 PM Robert Haas wrote: > On Sun, Nov 10, 2024 at 7:52 PM Richard Guo wrote: > > Hmm, currently we only consider grouped aggregation for eager > > aggregation. For grouped aggregation, the window function's > > arguments, as well as the PART

Re: Avoid unnecessary wrapping for more complex expressions

2024-12-17 Thread Richard Guo
On Wed, Dec 11, 2024 at 4:31 PM Richard Guo wrote: > As mentioned in that thread, I feel that we can apply a similar > optimization to more complex non-var expressions: if a strict > expression contains any variables of rels that are under the same > lowest nulling outer join as the

SIGSEGV in GrantLockLocal()

2024-12-17 Thread Richard Guo
Recently, I've encountered a core dump several times on master, with a backtrace like the one below. This one happened on 0f23dedc9. I was running some fuzz testing and had started around 20 sessions concurrently. (gdb) bt #0 in GrantLockLocal at lock.c:1758 #1 in GrantAwaitedLock at lock.c:18

Re: SIGSEGV in GrantLockLocal()

2024-12-18 Thread Richard Guo
On Wed, Dec 18, 2024 at 5:36 PM Heikki Linnakangas wrote: > I don't know how that can happen, but I suspect commit 3c0fd64fec > because it changed things in that area. If you can find a way to > reproduce that even sporadically, that would be very helpful! Thank you for the information. I'll try

<    3   4   5   6   7   8   9   10   >