On Tue, Oct 20, 2020 at 13:09, Mats Julian Olsen <m...@duneanalytics.com> wrote:
>
> On Tue, Oct 20, 2020 at 10:50 AM Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
>> On Tue, Oct 20, 2020 at 11:59 AM Mats Julian Olsen <m...@duneanalytics.com> wrote:
>>
>>> On Tue, Oct 20, 2020 at 9:50 AM David Rowley <dgrowle...@gmail.com> wrote:
>>>
>>>> On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen <m...@duneanalytics.com> wrote:
>>>> >
>>>> > The crux of our issue is that the query planner chooses a nested loop join for this query, essentially making this query (and other queries) take a very long time to complete. In contrast, by toggling `enable_nestloop` and `enable_seqscan` off we can take the total runtime down from 16 minutes to 2 minutes.
>>>> >
>>>> > 1) Vanilla plan (16 min): https://explain.depesz.com/s/NvDR
>>>> > 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
>>>> > 3) enable_nestloop=off; enable_seqscan=off (2 min): https://explain.depesz.com/s/0WXx
>>>> >
>>>> > How can I get Postgres not to loop over 12M rows?
>>>>
>>>> You'll likely want to look at what random_page_cost is set to. If the planner is preferring nested loops then it may be too low. You'll also want to see if effective_cache_size is set to something realistic. Higher values of that will prefer nested loops like this.
>>>
>>> random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in the gist). random_page_cost may be too low?
>>
>> random_page_cost = 2 is safer; the value 1.5 is a little bit aggressive for me.
>
> Thanks Pavel. I tried changing random_page_cost from 1.1 to 2, to 3... all the way up to 10. All values resulted in the same query plan, except for 10, which then executed a parallel hash join (however with sequential scans): https://explain.depesz.com/s/Srcb
>
> 10 seems like way too high a value for random_page_cost though?

It is not usual, but I know of analytics cases that use that value. Maybe effective_cache_size is too high.
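For reference, the effective_cache_size from the gist can be sanity-checked directly: pg_settings reports it as a count of 8 kB pages, so the raw number converts to gigabytes like this (a sketch, using the value quoted later in this thread):

```sql
-- effective_cache_size = 7864320 is in units of 8 kB pages,
-- so the configured value matches the '60GB' from the gist:
SELECT 7864320 * 8 / 1024 / 1024 AS effective_cache_size_gb;  -- 60
```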
>
>>>> You may also want to reduce max_parallel_workers_per_gather. It looks like you're not getting your parallel workers as often as you'd like. If the planner chooses a plan thinking it's going to get some workers and gets none, then that plan may be inferior to the one that the planner would have chosen if it had known the workers would be unavailable.
>>>
>>> Interesting, here are the values for those:
>>>
>>> max_parallel_workers = 8
>>> max_parallel_workers_per_gather = 4
>>>
>>>> > Let me know if there is anything I left out here that would be useful for further debugging.
>>>>
>>>> select name,setting from pg_Settings where category like 'Query Tuning%' and source <> 'default';
>>>> select version();
>>>
>>> default_statistics_target = 500
>>> effective_cache_size = 7864320
>>> random_page_cost = 1.1
>>>
>>> PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit
>>>
>>>> would be useful.
>>>>
>>>> David
>>>
>>> Thanks David, see above for more information.
>>>
>>> --
>>> Mats
>>> CTO @ Dune Analytics
>>> We're hiring: https://careers.duneanalytics.com
>
> --
> Mats
> CTO @ Dune Analytics
> We're hiring: https://careers.duneanalytics.com
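The cost-parameter experiments discussed in this thread can be run per-session, without editing postgresql.conf or affecting other connections. A minimal sketch (the query itself is a placeholder, not the actual Dune Analytics query):

```sql
-- Session-local planner experiments; nothing here persists
-- beyond the current connection.
SET random_page_cost = 2.0;               -- Pavel's safer value
SET max_parallel_workers_per_gather = 0;  -- test the no-workers plan David described
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;   -- substitute the slow nested-loop query here
RESET ALL;    -- restore this session to the server defaults
```

Using SET LOCAL inside a transaction instead scopes the change even more tightly, since it is undone automatically at COMMIT or ROLLBACK.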