Query plan: SELECT vs INSERT from same select

2019-07-23 Thread Alexander Voytsekhovskyy
I have a rather complicated query:

SELECT axis_x1, axis_y1, SUM(delivery_price) AS v_1 FROM (
    SELECT to_char(delivery_data.delivery_date, 'YYYY-MM') AS axis_x1,
           clients.id_client AS axis_y1,
           delivery_data.amount * production_price.price * groups.discount AS delivery_price

    FROM delivery_data
    JOIN client_tt ON (client_tt.id_client_tt = delivery_data.id_client_tt)
    JOIN clients ON (client_tt.id_client = clients.id_client)
    JOIN production ON (production.id = delivery_data.id_product)
    JOIN groups ON (groups.id = delivery_data.delivery_group_id
                AND client_tt.id_group = groups.id
                AND groups.id = clients.id_group)
    LEFT JOIN production_price ON (delivery_data.id_product = production_price.id_production
                AND groups.price_list_id = production_price.price_list_id
                AND delivery_data.delivery_date
                    BETWEEN production_price.date_from AND production_price.date_to)

    WHERE delivery_data.delivery_date BETWEEN '2019-03-01' AND '2019-06-30'
    AND delivery_data.delivery_group_id IN (...short list of values...)
    AND delivery_data.id_product IN (...list of values...)
) AS tmpsource

WHERE TRUE
GROUP BY GROUPING SETS ((axis_x1, axis_y1), (axis_x1), (axis_y1), ())

It runs well: it takes about 1 second and returns 4,000 rows.

You can see explain analyze verbose here:
https://explain.depesz.com/s/AEWj

The problem is that when I wrap it in

A)
INSERT INTO norepl_1542_result (axis_x1, axis_y1, v_1)
SELECT ... SAME QUERY ...

OR even

B)
WITH rows AS (
... SAME SELECT QUERY ...
)
INSERT INTO norepl_1542_result (axis_x1, axis_y1, v_1)
SELECT * FROM rows

the execution time balloons to 500+ seconds.

You can see explain analyze verbose here:
https://explain.depesz.com/s/AEWj

As you can see, 100% of the time goes to the same SELECT; there is no issue
with the INSERT part.

I have experimented with this a lot, and it reproduces every time.

So my question is: why does wrapping the SELECT query in INSERT ... SELECT
dramatically change the query plan and make it 500x slower?
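
A note on comparing the two runs: since the plans above were captured with
EXPLAIN ANALYZE VERBOSE, one thing worth checking (my assumption, not
confirmed in this thread) is whether the standalone SELECT uses a parallel
plan, because PostgreSQL will not use parallel workers for statements that
write data, such as INSERT ... SELECT. A minimal sketch, reusing the query
elided above:

    EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
    SELECT ... SAME QUERY ...;

    EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
    INSERT INTO norepl_1542_result (axis_x1, axis_y1, v_1)
    SELECT ... SAME QUERY ...;

If the first plan shows Gather nodes with "Workers Planned" and the second
does not, that would point to the loss of parallelism rather than to the
INSERT itself.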


Re: Simple IN vs IN values performance

2021-02-23 Thread Alexander Voytsekhovskyy
Thanks for the suggestion.

With temp tables there is another issue: the query already involves 5-6
tables and 2-3 IN filters. If I replace those filters with temp tables
(sketched below), the extra joins may push the query past the planner's
limits and it will start producing terrible plans, for example once the
genetic query optimizer kicks in.
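
For reference, a minimal sketch of the temp-table approach Ron suggests in
the quoted message below (table and column names are illustrative, borrowed
from the earlier thread):

    -- Load the ID list into a temp table instead of a long IN (...) clause
    CREATE TEMPORARY TABLE tmp_product_ids (id int PRIMARY KEY);
    INSERT INTO tmp_product_ids VALUES (1465), (1478);  -- ...rest of the list...
    ANALYZE tmp_product_ids;  -- give the planner accurate row counts

    SELECT d.*
    FROM delivery_data d
    JOIN tmp_product_ids t ON t.id = d.id_product;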

On Tue, Feb 23, 2021 at 1:45 AM Ron  wrote:

> On 2/22/21 9:00 AM, Oleksandr Voytsekhovskyy wrote:
>
> Greetings,
>
> We have queries with IN filters containing long lists of INT values.
>
> Sometimes they run extremely slowly, and I have found a suggestion to use
> the syntax
>
> Field IN (VALUES(1465), (1478), ...
>
> Instead of
>
> Field IN (1465, 1478, ...
>
> In some cases it helps, but in others it makes the query run 1000+ times
> slower.
>
>
> Insert the values into a TEMPORARY TABLE, then join that to your main
> table?
>
>
> So the issue is:
> - some queries much faster with simple IN
> - some queries much faster with IN + VALUES
>
> Here is the explain output for a case where it is extremely slow:
>
> ->  HashAggregate  (cost=5.78..9.62 rows=385 width=4)
>       Group Key: "*VALUES*".column1
>       ->  Values Scan on "*VALUES*"  (cost=0.00..4.81 rows=385 width=4)
>
> What is the right way to pass a long list of INT values to an IN filter?
>
> I am using PostgreSQL 13.1 on Ubuntu
>
>
> --
> Angular momentum makes the world go 'round.
>


Re: Simple IN vs IN values performance

2021-02-23 Thread Alexander Voytsekhovskyy
Greetings,

I have tested both options (ANY with a string plus parsing, and a plain
array), and there are cases where the execution time is 100+ times worse
than with IN.
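
For context, a minimal sketch of the delimited-string approach David
describes in the quoted message below, assuming the ID list arrives as a
single text parameter $1 (names are illustrative):

    -- $1 is one text parameter, e.g. '1465,1478,1490'
    SELECT *
    FROM delivery_data
    WHERE id_product = ANY (string_to_array($1, ',')::int[]);

This keeps the statement text constant regardless of list length, so a
prepared statement can be reused.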

On Tue, Feb 23, 2021 at 12:23 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Monday, February 22, 2021, Oleksandr Voytsekhovskyy wrote:
>
>> What is the right way to pass a long list of INT values to an IN filter?
>>
>
> Don’t.
>
> Pass in a delimited string, then parse that string into an array and use
> “= any(array)”.
>
> This has the primary benefit of making the input a single parameter.
>
> David J.
>
>