Re: query plan

2023-11-17 Thread Tom Lane
=?UTF-8?Q?Torsten_F=C3=B6rtsch?= writes: > This is part of a query plan: > Nested Loop Left Join (cost=26.32..47078866.36 rows=1344945195 width=626) >-> Nested Loop Left Join (cost=25.74..5312.48 rows=1344945195 > width=608) > -> Nested Loop Left Join (cost=6.79..2876.77 rows=1

Re: Query plan regression between CTE and views

2023-08-15 Thread David Gilman
I'm on PostgreSQL 15 with essentially a stock configuration. On Tue, Aug 15, 2023 at 8:58 AM Ron wrote: > > On 8/14/23 09:54, David Gilman wrote: > > I have a query that was originally written as a handful of CTEs out of > > convenience. It is producing a reasonable query plan because the CTE > >

Re: Query plan regression between CTE and views

2023-08-14 Thread Ron
On 8/14/23 09:54, David Gilman wrote: I have a query that was originally written as a handful of CTEs out of convenience. It is producing a reasonable query plan because the CTE materialization was kicking in at an appropriate place. The CTEs aren't totally linear. The graph looks like this, wher

Re: Query plan for "id IS NULL" on PK

2023-02-17 Thread Ben Chrobot
Thank you all for your responses! I will continue to put pressure on the vendor (Stitch Data, if anyone knows folks there) to address the issue on their end with the query being issued. Best, Ben Chrobot On Tue, Feb 14, 2023 at 11:11 PM Tom Lane wrote: > David Rowley writes: > > On Wed, 15 F

Re: Query plan for "id IS NULL" on PK

2023-02-15 Thread Ron
On 2/14/23 18:21, David Rowley wrote: [snip] since it likely only applies to nearly zero real-world cases Are you sure? -- Born in Arizona, moved to Babylonia.

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread Tom Lane
David Rowley writes: > On Wed, 15 Feb 2023 at 11:39, Peter J. Holzer wrote: >> OTOH it could also be argued that the optimizer should be able to >> perform the same simplifications as I did above and produce the same >> code for WHERE (("id" > ? OR "id" IS NULL)) AND (("id" <= ?)) >> as for WHERE

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread David Rowley
On Wed, 15 Feb 2023 at 11:39, Peter J. Holzer wrote: > OTOH it could also be argued that the optimizer should be able to > perform the same simplifications as I did above and produce the same > code for WHERE (("id" > ? OR "id" IS NULL)) AND (("id" <= ?)) > as for WHERE (("id" > ?)) AND (("id" <=

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread Rob Sargent
On 2/14/23 15:43, Peter J. Holzer wrote: On 2023-02-14 15:36:32 -0700, Rob Sargent wrote: But if the query is supposed to be generic and re-used in a situation where id could be null, wouldn't the null id records be fetched every time? No, they will never be fetched because of the AND (("id" <=

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread Peter J. Holzer
On 2023-02-14 15:36:32 -0700, Rob Sargent wrote: > But if the query is supposed to be generic and re-used in a situation where id > could be null, wouldn't the null id records be fetched every time?  No, they will never be fetched because of the AND (("id" <= ?)). hp -- _ | Peter J.

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread Peter J. Holzer
On 2023-02-14 17:04:51 -0500, Ben Chrobot wrote: > We have a large table (~470 million rows) with integer primary key id (not > null) on a Postgres 14.5 cluster. A third-party tool is attempting to perform > a > SELECT-based full table copy in preparation for log-based sync with a query > like the

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread Rob Sargent
On 2/14/23 15:30, David G. Johnston wrote: On Tue, Feb 14, 2023 at 3:25 PM Rob Sargent wrote: When will id be null in a primary key? The OP seems to be aware of this... "We cannot change the query being executed. Is there any way we can make the query planner ignore `OR (id IS NULL)` (

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread David G. Johnston
On Tue, Feb 14, 2023 at 3:25 PM Rob Sargent wrote: > > When will id be null in a primary key? > > The OP seems to be aware of this... "We cannot change the query being executed. Is there any way we can make the query planner ignore `OR (id IS NULL)` (as that will never be the case for the PK) an

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread Rob Sargent
On 2/14/23 15:04, Ben Chrobot wrote: Hello, Long time listener, first time caller. We have a large table (~470 million rows) with integer primary key id (not null) on a Postgres 14.5 cluster. A third-party tool is attempting to perform a SELECT-based full table copy in preparation for log-ba

Re: Query plan prefers hash join when nested loop is much faster

2020-08-25 Thread iulian dragos
On Tue, Aug 25, 2020 at 12:36 PM David Rowley wrote: > On Tue, 25 Aug 2020 at 22:10, iulian dragos > wrote: > > Thanks for the tip! Indeed, `n_distinct` isn't right. I found it in > pg_stats set at 131736.0, but the actual number is much higher: 210104361. > I tried to set it manually, but the p

Re: Query plan prefers hash join when nested loop is much faster

2020-08-25 Thread David Rowley
On Tue, 25 Aug 2020 at 22:10, iulian dragos wrote: > Thanks for the tip! Indeed, `n_distinct` isn't right. I found it in pg_stats > set at 131736.0, but the actual number is much higher: 210104361. I tried to > set it manually, but the plan is still the same (both the actual number and a > perc

Re: Query plan prefers hash join when nested loop is much faster

2020-08-25 Thread iulian dragos
On Tue, Aug 25, 2020 at 12:27 AM David Rowley wrote: > On Sat, 22 Aug 2020 at 00:35, iulian dragos > wrote: > > I am trying to understand why the query planner insists on using a hash > join, and how to make it choose the better option, which in this case would > be a nested loop. > > > |

Re: Query plan prefers hash join when nested loop is much faster

2020-08-24 Thread David Rowley
On Sat, 22 Aug 2020 at 00:35, iulian dragos wrote: > I am trying to understand why the query planner insists on using a hash join, > and how to make it choose the better option, which in this case would be a > nested loop. > | -> Index Scan using > test_result_module

Re: Query plan prefers hash join when nested loop is much faster

2020-08-24 Thread iulian dragos
On Mon, Aug 24, 2020 at 4:21 PM iulian dragos wrote: > Hi Michael, > > Thanks for the answer. It's an RDS instance using SSD storage and the > default `random_page_cost` set to 4.0. I don't expect a lot of repetitive > queries here, so I think caching may not be extremely useful. I wonder if > th

Re: Query plan prefers hash join when nested loop is much faster

2020-08-24 Thread iulian dragos
Hi Michael, Thanks for the answer. It's an RDS instance using SSD storage and the default `random_page_cost` set to 4.0. I don't expect a lot of repetitive queries here, so I think caching may not be extremely useful. I wonder if the selectivity of the query is wrongly estimated (out of 500 millio

Re: Query plan prefers hash join when nested loop is much faster

2020-08-21 Thread Michael Lewis
Your system is preferring sequential scan to using test_result_module_result_id_idx in this case. What type of storage do you use, what type of cache hits do you expect, and what do you have random_page_cost set to? That comes to mind as a significant factor in choosing index scans based on costs.

Re: Query plan: SELECT vs INSERT from same select

2019-07-24 Thread Alban Hertroys
> On 23 Jul 2019, at 22:29, Alexander Voytsekhovskyy > wrote: > > I have quite complicated query: > > SELECT axis_x1, axis_y1, SUM(delivery_price) as v_1 FROM ( > SELECT to_char(delivery_data.delivery_date, '-MM') as axis_x1, > clients.id_client as axis_y1, delivery_data.amount * produc

Re: Query plan: SELECT vs INSERT from same select

2019-07-23 Thread Tom Lane
[ please keep the list cc'd ] Alexander Voytsekhovskyy writes: > Sorry again > here is both links: > https://explain.depesz.com/s/AEWj > https://explain.depesz.com/s/CHwF Don't think I believe that those are the same query --- there's a CTE in the second one that doesn't appear in the first, and

Re: Query plan: SELECT vs INSERT from same select

2019-07-23 Thread Tom Lane
Alexander Voytsekhovskyy writes: > You can see explain analyze verbose here: > https://explain.depesz.com/s/AEWj > The problem is, when i wrap it to > A) > INSERT INTO norepl_1542_result (axis_x1, axis_y1, v_1) > SELECT SAME QUERY > OR even > B) > WITH rows AS ( > ... SAME SELECT QUERY ..

Re: Query plan: SELECT vs INSERT from same select

2019-07-23 Thread Igor Korot
Hi, On Tue, Jul 23, 2019 at 3:29 PM Alexander Voytsekhovskyy wrote: > > I have quite complicated query: > > SELECT axis_x1, axis_y1, SUM(delivery_price) as v_1 FROM ( > SELECT to_char(delivery_data.delivery_date, '-MM') as axis_x1, > clients.id_client as axis_y1, delivery_data.amount * prod