Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT

2023-06-12 Thread benoit
Hello


I have a database with few 60gb tables. Tables rows are requested with multiple 
ANY or IN operators. I am not able to find an easy way to make DB able to use 
indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.


I am using version 13 but soon 14.


I wrote a reproduction script on version 14 with plans included. 
https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d


I also have plans on a snapshot of the DB with real data.

- The current query that I try to improve : 
https://explain.dalibo.com/plan/8b8f6e0he9feb551

  - I added the DB schema + index in query view. As you can see I have many 
indexes for testing purpose and try what the planner can do.

- The optimized query when I have only one ANY and migrate to UNION ALL for 
each parameter of the ANY operator 
https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would 
like but it means generate some merge to be able to get a fast result.

- The new issue I have when I have a new ANY operator on the previous optimized 
query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946


It seems to me quite undoable to generate for every parameters a query that 
will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in 
an array.


Is there a misusage of my indexes?

Is there a limitation when using ANY or IN operators and ordered LIMIT behind?


Thanks a lot



RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT

2023-06-12 Thread benoit
Sadly it doesn't help to disable indexscan. The plan : 
https://explain.dalibo.com/plan/3b3gfce5b29c3hh4


De : Peter Geoghegan 
Envoyé : lundi 12 juin 2023 22:34:50
À : benoit
Cc : pgsql-performance@lists.postgresql.org
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and 
ORDER BY LIMIT

On Mon, Jun 12, 2023 at 1:17 PM benoit  wrote:
> Is there a misusage of my indexes?
>
> Is there a limitation when using ANY or IN operators and ordered LIMIT behind?

It's complicated. Do you find that you get satisfactory performance if
you force a bitmap index scan? In other words, what is the effect of
"set enable_indexscan = off" on your original query? Does that speed
up execution at all? (I think that this approach ought to produce a
plan that uses a bitmap index scan in place of the index scan, without
changing anything else.)

--
Peter Geoghegan


RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT

2023-06-12 Thread benoit
This new index is used but still the read is 230mb.


https://explain.dalibo.com/plan/b0f28a9e8a136afd



De : Chris Hoover 
Envoyé : lundi 12 juin 2023 22:55
À : benoit
Cc : pgsql-performance@lists.postgresql.org
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and 
ORDER BY LIMIT

I normally create my indexes to match the where clause of the query. While 
technically, it should not matter, I find a lot of time, it does.

I would create an index on (status, sender_reference, sent_at) and see if the 
improves your query performance.

SELECT * FROM docs WHERE status IN ('draft', 'sent') AND 
sender_reference IN ('Custom/1175', 'Client/362', 'Custom/280') ORDER BY 
sent_at DESC
Thanks,


Chris Hoover
Senior DBA
AWeber.com
Cell: (803) 528-2269
Email: chr...@aweber.com



On Jun 12, 2023, at 4:17 PM, benoit  wrote:


Hello

I have a database with few 60gb tables. Tables rows are requested with multiple 
ANY or IN operators. I am not able to find an easy way to make DB able to use 
indexes. I often hit the index, but see a a spike of 200mb of IO or disk read.

I am using version 13 but soon 14.

I wrote a reproduction script on version 14 with plans included. 
https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d

I also have plans on a snapshot of the DB with real data.
- The current query that I try to improve : 
https://explain.dalibo.com/plan/8b8f6e0he9feb551
  - I added the DB schema + index in query view. As you can see I have many 
indexes for testing purpose and try what the planner can do.
- The optimized query when I have only one ANY and migrate to UNION ALL for 
each parameter of the ANY operator 
https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would 
like but it means generate some merge to be able to get a fast result.
- The new issue I have when I have a new ANY operator on the previous optimized 
query. Big IO/read https://explain.dalibo.com/plan/e7ha9g637b4eh946

It seems to me quite undoable to generate for every parameters a query that 
will then merge. I have sometimes 3-4 ANY operators with up to 15 elements in 
an array.

Is there a misusage of my indexes?
Is there a limitation when using ANY or IN operators and ordered LIMIT behind?

Thanks a lot



Re: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT

2023-06-19 Thread Benoit Tigeot

No it is not. But do you think there is an impact here?



Le 18/06/2023 à 23:23, msal...@msym.fr a écrit :


Hi,

Do you really need to do “select *”?

In other words, is it necessary to have all columns in the result?

/Michel SALAIS/

*De :*benoit 
*Envoyé :* lundi 12 juin 2023 23:35
*À :* Chris Hoover 
*Cc :* pgsql-performance@lists.postgresql.org
*Objet :* RE: Forced to use UNION ALL when having multiple ANY 
operators and ORDER BY LIMIT


This new index is used but still the read is 230mb.

https://explain.dalibo.com/plan/b0f28a9e8a136afd



*De :*Chris Hoover 
*Envoyé :* lundi 12 juin 2023 22:55
*À :* benoit
*Cc :* pgsql-performance@lists.postgresql.org
*Objet :* Re: Forced to use UNION ALL when having multiple ANY 
operators and ORDER BY LIMIT


I normally create my indexes to match the where clause of the query. 
While technically, it should not matter, I find a lot of time, it does.


I would create an index on (status, sender_reference, sent_at) and see 
if the improves your query performance.





SELECT * FROM docs WHEREstatus

IN('draft',

'sent')

ANDsender_reference

IN('Custom/1175',

'Client/362',

'Custom/280')

ORDER BYsent_at DESC

Thanks,

Chris Hoover

Senior DBA

AWeber.com

Cell: (803) 528-2269

Email: chr...@aweber.com



On Jun 12, 2023, at 4:17 PM, benoit  wrote:

Hello

I have a database with few 60gb tables. Tables rows are requested
with multiple ANY or IN operators. I am not able to find an easy
way to make DB able to use indexes. I often hit the index, but see
a a spike of 200mb of IO or disk read.

I am using version 13 but soon 14.

I wrote a reproduction script on version 14 with plans included.
https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d

I also have plans on a snapshot of the DB with real data.

- The current query that I try to improve :
https://explain.dalibo.com/plan/8b8f6e0he9feb551

  - I added the DB schema + index in query view. As you can see I
have many indexes for testing purpose and try what the planner can do.

- The optimized query when I have only one ANY and migrate to
UNION ALL for each parameter of the ANY operator
https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast
as I would like but it means generate some merge to be able to get
a fast result.

- The new issue I have when I have a new ANY operator on the
previous optimized query. Big IO/read
https://explain.dalibo.com/plan/e7ha9g637b4eh946

It seems to me quite undoable to generate for every parameters a
query that will then merge. I have sometimes 3-4 ANY operators
with up to 15 elements in an array.

Is there a misusage of my indexes?

Is there a limitation when using ANY or IN operators and
ordered LIMIT behind?

Thanks a lot