pá 22. 11. 2019 v 19:42 odesílatel Luís Roberto Weck < luisrobe...@siscobra.com.br> napsal:
> Em 22/11/2019 14:55, Pavel Stehule escreveu: > > > > pá 22. 11. 2019 v 18:37 odesílatel Luís Roberto Weck < > luisrobe...@siscobra.com.br> napsal: > >> Hey, >> >> I'm trying to figure out why Postgres is choosing a Hash Join over a >> Nested Loop in this query: >> >> SELECT T1.PesID, T1.PesNom, T1.PesValSal, T1.PesCPFCNPJ, T2.CarAti, >> T1.CarCod, T1.EmpCod, >> T2.CarFan, T1.PesDatAge, T1.PesCod, COALESCE( T3.PesDatAnt, DATE >> '00010101') AS PesDatAnt >> FROM ((public.Pessoa T1 >> INNER JOIN public.Carteira T2 ON T2.EmpCod = T1.EmpCod AND >> T2.CarCod = T1.CarCod) >> LEFT JOIN (SELECT MIN(COALESCE( T5.ConVenAnt, DATE >> '00010101')) AS PesDatAnt, T4.EmpCod, T4.CarCod, T4.ConPesCod AS ConPesCod >> FROM (public.Contrato T4 >> LEFT JOIN (SELECT MIN(ConParDatVen) AS >> ConVenAnt, EmpCod, CarCod, ConPesCod, ConSeq >> FROM >> public.ContratoParcela T5 >> WHERE ConParAti = true >> AND ConParValSal > 0 >> GROUP BY EmpCod, CarCod, >> ConPesCod, ConSeq ) T5 ON T5.EmpCod = T4.EmpCod AND >> >> T5.CarCod = T4.CarCod AND >> >> T5.ConPesCod = T4.ConPesCod AND >> >> T5.ConSeq = T4.ConSeq) >> WHERE T4.ConAti = TRUE >> GROUP BY T4.EmpCod, T4.CarCod, T4.ConPesCod ) >> T3 ON t3.EmpCod = T1.EmpCod AND >> >> t3.CarCod = T1.CarCod AND >> >> t3.ConPesCod = T1.PesCod) >> WHERE (T2.CarAti = true) >> AND (T1.EmpCod = 112) >> and (UPPER(T1.PesNom) like UPPER('%MARIA%')) >> ORDER BY T1.EmpCod, T1.CarCod, T1.PesCod >> >> Here the Hash Join[1] plan takes ~700ms, and if I change the first LEFT >> JOIN to a LEFT JOIN LATERAL, forcing a nested loop, the query[2] runs in >> 3ms. >> >> [1] https://explain.depesz.com/s/8IL3 >> [2] https://explain.depesz.com/s/f8Q9 >> >> > Maybe I am wrong, but probably you have to do more than just change LEFT > JOIN to LATERAL JOIN. Lateral join is based on correlated subquery - so you > had to push some predicates to subquery - and then the query can be much > more effective. > > Regards > > Pavel > > > > > >> PostgreSQL version is 11.5, I have run analyze on all the tables. >> >> PG settings: >> >> name |setting |unit| >> -------------------------------|---------|----| >> autovacuum |on | | >> default_statistics_target |250 | | >> effective_cache_size |983040 |8kB | >> effective_io_concurrency |200 | | >> max_parallel_workers |6 | | >> max_parallel_workers_per_gather|3 | | >> random_page_cost |1.1 | | >> work_mem |51200 |kB | >> > > I'm sorry, I am not sure I understood. > > This is the altered query: > > SELECT T1.PesID, T1.PesNom, T1.PesValSal, T1.PesCPFCNPJ, T2.CarAti, > T1.CarCod, T1.EmpCod, T2.CarFan, T1.PesDatAge, T1.PesCod, > COALESCE( T3.PesDatAnt, DATE '00010101') AS PesDatAnt > FROM ((public.Pessoa T1 > INNER JOIN public.Carteira T2 ON T2.EmpCod = T1.EmpCod AND > T2.CarCod = T1.CarCod) > LEFT JOIN *LATERAL *(SELECT MIN(COALESCE( T5.ConVenAnt, DATE > '00010101')) AS PesDatAnt, T4.EmpCod, T4.CarCod, T4.ConPesCod AS ConPesCod > FROM (public.Contrato T4 > LEFT JOIN (SELECT > MIN(ConParDatVen) AS ConVenAnt, EmpCod, CarCod, ConPesCod, ConSeq > FROM > public.ContratoParcela T5 > WHERE ConParAti = true > and ConParValSal > 0 > > GROUP BY EmpCod, > CarCod, ConPesCod, ConSeq ) T5 ON T5.EmpCod = T4.EmpCod AND T5.CarCod = > T4.CarCod AND T5.ConPesCod = T4.ConPesCod AND T5.ConSeq = T4.ConSeq) > WHERE T4.ConAti = TRUE > *AND t4.EmpCod = T1.EmpCod AND > t4.CarCod = T1.CarCod AND t4.ConPesCod = T1.PesCod* > GROUP BY T4.EmpCod, T4.CarCod, > T4.ConPesCod ) T3 ON *TRUE ) --ON t3.EmpCod = T1.EmpCod AND t3.CarCod = > T1.CarCod AND t3.ConPesCod = T1.PesCod) * > WHERE (T2.CarAti = true) > AND (T1.EmpCod = 112) > and (UPPER(T1.PesNom) like UPPER('%MARIA%')) > ORDER BY T1.EmpCod, T1.CarCod, T1.PesCod > > In bold are the changes I've made to the query. I am sure PostgreSQL is > able to push it down, since it is much faster now. The problem I have is > that this is a query generated by an ORM, So I can't change it. > > I would like to understand why wasn't Postgres able to optimize it to a > nested loop. Is there something I can do with the statistics? > I don't think. Postgres optimizer just doesn't support this optimization. It has sense only when you know so number of loops is very small - else nested loop should be much slower. Regards Pavel > Thanks!! > >