Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-15 Thread Tomas Vondra
On 2/15/22 01:06, A Shaposhnikov wrote: Interestingly, I have a second PG 14.2 database with identical table definitions but about 10% smaller row counts, and the exact same query works fast there without the 2nd condition. Are you sure about the 10%? Because in the plans from the first…

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-14 Thread Michael Lewis
I think that, theoretically, if the planner sees a condition like a.column1 > constant_value, and it also has a condition like a.column1 = b.column2, then it could autogenerate the b.column2 > constant_value condition. And of course > could be <, <=, >=, or <>. But I could be wrong, particularly with…
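A minimal sketch of the rewrite being described, using hypothetical tables a and b (not from the thread):

```sql
-- The planner sees an equality join plus a range condition on one side:
SELECT *
FROM a
JOIN b ON a.column1 = b.column2
WHERE a.column1 > 100;

-- In principle it could derive the transitive predicate on the other side,
-- which is what the manual workaround later in this thread does by hand:
SELECT *
FROM a
JOIN b ON a.column1 = b.column2
WHERE a.column1 > 100
  AND b.column2 > 100;  -- derived from the equality + range condition
```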

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-14 Thread A Shaposhnikov
I started using the latest PostgreSQL 14.2, and the query plans there for a simplified query joining just 2 tables look like: explain analyze select t.*, ta.* from team as t, team_aliases as ta where ta.team_id = t.id and t.id > 2 order by t.id limit 1000; QUERY PLAN …

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-04 Thread Tomas Vondra
On 2/4/22 05:21, A Shaposhnikov wrote: Tomas, thank you! The query: select 1 from data as d, data_class as dc where dc.data_id = d.id and d.id > 205284974 and dc.data_id > 205284974 -- new condition order by d.id limit 1000; totally solved it - it is now fast under all…

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-03 Thread A Shaposhnikov
Tomas, thank you! The query: select 1 from data as d, data_class as dc where dc.data_id = d.id and d.id > 205284974 and dc.data_id > 205284974 -- new condition order by d.id limit 1000; totally solved it - it is now fast under all conditions! I thought that the optimizer would…

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-03 Thread Michael Lewis
I would guess that autovacuum is analyzing the table and causing the stats to change, resulting in different estimates and therefore different plans. Unless you can get the estimate much more accurate, you won't get far with expecting a stable plan that performs well. How is data_class_pk…
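One common way to chase down estimates like the one discussed here (a sketch, not something quoted in the thread; table and column names are taken from the queries above) is to raise the per-column statistics target on the join column and re-analyze:

```sql
-- Increase the sample size for the join column (default target is 100),
-- then refresh the statistics so the planner sees the new histogram.
ALTER TABLE data_class ALTER COLUMN data_id SET STATISTICS 1000;
ANALYZE data_class;
```

Whether this stabilizes the plan depends on whether the mis-estimate comes from the column distribution at all, which the thread leaves open.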

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-03 Thread A Shaposhnikov
The random_page_cost = 1.2 and seq_page_cost = 1 in my setup are the same. I also ran VACUUM ANALYZE on all tables. I have no idea why the optimizer thinks that the merge on 2 200M-row tables is faster than 50 index lookups, other than that it "thinks" in-memory merges are very low cost regardless…
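For anyone reproducing this, the cost settings the poster mentions can be checked per session without touching postgresql.conf:

```sql
-- Inspect the planner cost settings discussed in this message
SHOW random_page_cost;        -- 1.2 in the poster's setup
SHOW seq_page_cost;           -- 1
SHOW effective_cache_size;    -- the setting the whole thread is about
```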

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-03 Thread Vijaykumar Jain
On Fri, 4 Feb 2022 at 01:03, A Shaposhnikov wrote: > I made a mistake yesterday claiming that the created statistics > changed the row counts in the estimates - it did not - I looked at > the wrong query yesterday. In the correct query plan the row estimate > still differs from the actual by many…

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-03 Thread A Shaposhnikov
I made a mistake yesterday claiming that the created statistics changed the row counts in the estimates - it did not - I looked at the wrong query yesterday. In the correct query plan the row estimate still differs from the actual by many orders of magnitude: Nested Loop (cost=1.01..27584834.53 …
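The exact statistics object the poster created is not shown in the preview; a plausible sketch of that kind of experiment for an M:M link table (column names here are hypothetical) looks like:

```sql
-- Extended statistics on the link table's columns; whether this helps
-- depends on the correlation the planner is missing.
CREATE STATISTICS data_class_stats (ndistinct)
  ON data_id, class_id FROM data_class;
ANALYZE data_class;
```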

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-02 Thread Tomas Vondra
On 2/2/22 22:10, Artyom Shaposhnikov wrote: the row estimate became ~1000x smaller with the stat in place, so it looks like it grossly miscalculates the query plans without the stats for large tables representing M:M relations. Well, if the estimates are significantly off (and 3 orders of magnitude…

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-02 Thread Imre Samu
> Question: are there other ways to give PostgreSQL a hint? What is your PG version? Maybe try the pg_hint_plan extension ( https://github.com/ossc-db/pg_hint_plan ): "pg_hint_plan makes it possible to tweak PostgreSQL execution plans using so-called "hints" in SQL comments, like /*+ SeqScan(a) */."
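Expanding the README's one-liner into a runnable shape (the table and predicate here are illustrative, not from the thread):

```sql
-- With the pg_hint_plan extension loaded, a hint comment placed at the
-- start of the statement overrides the planner's scan choice:
/*+ SeqScan(a) */
SELECT * FROM a WHERE id > 100;
```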

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-02 Thread Artyom Shaposhnikov
The row estimate became ~1000x smaller with the stat in place, so it looks like it grossly miscalculates the query plans without the stats for large tables representing M:M relations. On Wed, Feb 2, 2022 at 11:47 AM Michael Lewis wrote: > > What does the row estimate look like on the scan of data…

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-02 Thread Michael Lewis
What does the row estimate look like on the scan of the data table with that statistic in place? Anytime the stats give a mis-estimate this far off, I wouldn't expect plans to be optimal except by luck. Index Scan using data_pkey on data t (cost=0.57..21427806.53 *rows=58785023* width=131) (actual…

increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-02 Thread Artyom Shaposhnikov
I recently moved a Postgres DB to a more powerful server with 1TB of RAM instead of the 64GB before. To my surprise, after running the tuning on the new server, the performance of joins deteriorated by 4000x compared to the old server. I carefully examined all of the changes and found the culprit: if I…
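effective_cache_size can be changed per session, which makes the plan flip described above easy to reproduce without restarting the server. A sketch using the query quoted earlier in the thread (the two values are illustrative stand-ins for the old and new servers' settings):

```sql
-- Plan with a small cache-size estimate (old server's ballpark)
SET effective_cache_size = '48GB';
EXPLAIN SELECT 1 FROM data AS d, data_class AS dc
WHERE dc.data_id = d.id AND d.id > 205284974
ORDER BY d.id LIMIT 1000;

-- Plan with a large cache-size estimate (tuned new server)
SET effective_cache_size = '768GB';
-- re-run the same EXPLAIN and compare the chosen join strategy
```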