Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
Hello I have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read. I am using version 13 but soon 14. I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d I also have plans on a snapshot of the DB with real data. - The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551 - I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do. - The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result. - The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946 It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array. Is there a misusage of my indexes? Is there a limitation when using ANY or IN operators and ordered LIMIT behind? Thanks a lot
RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
Sadly it doesn't help to disable indexscan. The plan : https://explain.dalibo.com/plan/3b3gfce5b29c3hh4 De : Peter Geoghegan Envoyé : lundi 12 juin 2023 22:34:50 À : benoit Cc : pgsql-performance@lists.postgresql.org Objet : Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT On Mon, Jun 12, 2023 at 1:17 PM benoit wrote: > Is there a misusage of my indexes? > > Is there a limitation when using ANY or IN operators and ordered LIMIT behind? It's complicated. Do you find that you get satisfactory performance if you force a bitmap index scan? In other words, what is the effect of "set enable_indexscan = off" on your original query? Does that speed up execution at all? (I think that this approach ought to produce a plan that uses a bitmap index scan in place of the index scan, without changing anything else.) -- Peter Geoghegan
RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
This new index is used but still the read is 230mb. https://explain.dalibo.com/plan/b0f28a9e8a136afd De : Chris Hoover Envoyé : lundi 12 juin 2023 22:55 À : benoit Cc : pgsql-performance@lists.postgresql.org Objet : Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT I normally create my indexes to match the where clause of the query. While technically, it should not matter, I find a lot of time, it does. I would create an index on (status, sender_reference, sent_at) and see if the improves your query performance. SELECT * FROM docs WHERE status IN ('draft', 'sent') AND sender_reference IN ('Custom/1175', 'Client/362', 'Custom/280') ORDER BY sent_at DESC Thanks, Chris Hoover Senior DBA AWeber.com Cell: (803) 528-2269 Email: chr...@aweber.com On Jun 12, 2023, at 4:17 PM, benoit wrote: Hello I have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read. I am using version 13 but soon 14. I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d I also have plans on a snapshot of the DB with real data. - The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551 - I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do. - The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result. - The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946 It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array. Is there a misusage of my indexes? Is there a limitation when using ANY or IN operators and ordered LIMIT behind? Thanks a lot
Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
No it is not. But do you think there is an impact here? Le 18/06/2023 à 23:23, msal...@msym.fr a écrit : Hi, Do you really need to do “select *”? In other words, is it necessary to have all columns in the result? /Michel SALAIS/ *De :*benoit *Envoyé :* lundi 12 juin 2023 23:35 *À :* Chris Hoover *Cc :* pgsql-performance@lists.postgresql.org *Objet :* RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT This new index is used but still the read is 230mb. https://explain.dalibo.com/plan/b0f28a9e8a136afd *De :*Chris Hoover *Envoyé :* lundi 12 juin 2023 22:55 *À :* benoit *Cc :* pgsql-performance@lists.postgresql.org *Objet :* Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT I normally create my indexes to match the where clause of the query. While technically, it should not matter, I find a lot of time, it does. I would create an index on (status, sender_reference, sent_at) and see if the improves your query performance. SELECT * FROM docs WHEREstatus IN('draft', 'sent') ANDsender_reference IN('Custom/1175', 'Client/362', 'Custom/280') ORDER BYsent_at DESC Thanks, Chris Hoover Senior DBA AWeber.com Cell: (803) 528-2269 Email: chr...@aweber.com On Jun 12, 2023, at 4:17 PM, benoit wrote: Hello I have a database with few 60gb tables. Tables rows are requested with multiple ANY or IN operators. I am not able to find an easy way to make DB able to use indexes. I often hit the index, but see a a spike of 200mb of IO or disk read. I am using version 13 but soon 14. I wrote a reproduction script on version 14 with plans included. https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d I also have plans on a snapshot of the DB with real data. - The current query that I try to improve : https://explain.dalibo.com/plan/8b8f6e0he9feb551 - I added the DB schema + index in query view. As you can see I have many indexes for testing purpose and try what the planner can do. - The optimized query when I have only one ANY and migrate to UNION ALL for each parameter of the ANY operator https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would like but it means generate some merge to be able to get a fast result. - The new issue I have when I have a new ANY operator on the previous optimized query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946 It seems to me quite undoable to generate for every parameters a query that will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in an array. Is there a misusage of my indexes? Is there a limitation when using ANY or IN operators and ordered LIMIT behind? Thanks a lot