On Tue, Oct 20, 2020 at 11:16 AM Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
> On Tue, Oct 20, 2020 at 13:09, Mats Julian Olsen <m...@duneanalytics.com> wrote:
>
>> On Tue, Oct 20, 2020 at 10:50 AM Pavel Stehule <pavel.steh...@gmail.com> wrote:
>>
>>> On Tue, Oct 20, 2020 at 11:59, Mats Julian Olsen <m...@duneanalytics.com> wrote:
>>>
>>>> On Tue, Oct 20, 2020 at 9:50 AM David Rowley <dgrowle...@gmail.com> wrote:
>>>>
>>>>> On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen <m...@duneanalytics.com> wrote:
>>>>> >
>>>>> > The crux of our issue is that the query planner chooses a nested loop join for this query, essentially making this query (and other queries) take a very long time to complete. In contrast, by toggling `enable_nestloop` and `enable_seqscan` off, we can take the total runtime down from 16 minutes to 2 minutes.
>>>>> >
>>>>> > 1) Vanilla plan (16 min): https://explain.depesz.com/s/NvDR
>>>>> > 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
>>>>> > 3) enable_nestloop=off; enable_seqscan=off (2 min): https://explain.depesz.com/s/0WXx
>>>>> >
>>>>> > How can I get Postgres not to loop over 12M rows?
>>>>>
>>>>> You'll likely want to look at what random_page_cost is set to. If the planner is preferring nested loops, then it may be too low. You'll also want to see if effective_cache_size is set to something realistic. Higher values of that will make the planner prefer nested loops like this.
>>>>
>>>> random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in the gist). Could random_page_cost be too low?
>>>
>>> random_page_cost = 2 is safer; the value 1.5 is a little aggressive for me.
>>
>> Thanks Pavel. I tried changing random_page_cost from 1.1 to 2, to 3... all the way up to 10. All values resulted in the same query plan, except for 10, which then executed a parallel hash join (however, still with sequential scans): https://explain.depesz.com/s/Srcb
>>
>> 10 seems like far too high a value for random_page_cost, though?
>
> It is not usual, but I know of analytics cases that use such a value. But maybe effective_cache_size is too high.

Changing effective_cache_size from 10GB up to 60GB does not affect the Nested Loop part of this query plan. It does alter the inner side of the loop from sequential scans (low cache) to index scans (high cache).

>>>>> You may also want to reduce max_parallel_workers_per_gather. It looks like you're not getting your parallel workers as often as you'd like. If the planner chooses a plan thinking it's going to get some workers and gets none, then that plan may be inferior to the one that the planner would have chosen if it had known the workers would be unavailable.
>>>>
>>>> Interesting, here are the values for those:
>>>> max_parallel_workers = 8
>>>> max_parallel_workers_per_gather = 4
>>>>
>>>>> > Let me know if there is anything I left out here that would be useful for further debugging.
>>>>>
>>>>> select name, setting from pg_settings where category like 'Query Tuning%' and source <> 'default';
>>>>> select version();
>>>>
>>>> default_statistics_target = 500
>>>> effective_cache_size = 7864320
>>>> random_page_cost = 1.1
>>>>
>>>> PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit
>>>>
>>>>> would be useful.
>>>>>
>>>>> David
>>>>
>>>> Thanks David, see above for more information.
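For completeness, here is roughly how I have been running these experiments, in a throwaway transaction so that nothing persists to postgresql.conf. This is only a sketch; <query> is a placeholder for our actual join, not a literal statement, and the values are just the ones discussed in this thread:

    BEGIN;
    -- Planner cost knobs under test.
    SET LOCAL random_page_cost = 2.0;
    SET LOCAL effective_cache_size = '10GB';
    -- Blunt toggles that steer the planner away from nested loops and seq scans.
    SET LOCAL enable_nestloop = off;
    SET LOCAL enable_seqscan = off;
    EXPLAIN (ANALYZE, BUFFERS) <query>;
    ROLLBACK;  -- SET LOCAL values are discarded with the transaction

The nice property of SET LOCAL is that the overrides end with the transaction, so a long-lived session can't accidentally keep nested loops disabled for unrelated queries.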
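Similarly, to test David's point about plans that count on parallel workers that never materialize, the planner can be asked to cost and execute the query as if no workers were available (again just a sketch, same placeholder):

    SET max_parallel_workers_per_gather = 0;  -- plan and run without parallelism
    EXPLAIN (ANALYZE, BUFFERS) <query>;
    RESET max_parallel_workers_per_gather;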

--
Mats
CTO @ Dune Analytics
We're hiring: https://careers.duneanalytics.com