Query plan: SELECT vs INSERT from same select
I have a quite complicated query:

SELECT axis_x1, axis_y1, SUM(delivery_price) as v_1
FROM (
    SELECT to_char(delivery_data.delivery_date, '-MM') as axis_x1,
           clients.id_client as axis_y1,
           delivery_data.amount * production_price.price * groups.discount as delivery_price
    FROM delivery_data
    JOIN client_tt ON (client_tt.id_client_tt = delivery_data.id_client_tt)
    JOIN clients ON (client_tt.id_client = clients.id_client)
    JOIN production ON (production.id = delivery_data.id_product)
    JOIN groups ON (groups.id = delivery_data.delivery_group_id
                AND client_tt.id_group = groups.id
                AND groups.id = clients.id_group)
    LEFT JOIN production_price ON (delivery_data.id_product = production_price.id_production
                AND groups.price_list_id = production_price.price_list_id
                AND delivery_data.delivery_date BETWEEN production_price.date_from AND production_price.date_to)
    WHERE delivery_data.delivery_date BETWEEN '2019-03-01' AND '2019-06-30'
      AND delivery_data.delivery_group_id IN (...short list of values...)
      AND delivery_data.id_product IN ()
) AS tmpsource
WHERE TRUE
GROUP BY GROUPING SETS ((axis_x1, axis_y1), (axis_x1), (axis_y1), ())

It runs well: it takes 1 s and returns 4000 rows. You can see EXPLAIN ANALYZE VERBOSE here: https://explain.depesz.com/s/AEWj

The problem is, when I wrap it in either

A)

INSERT INTO norepl_1542_result (axis_x1, axis_y1, v_1)
SELECT ...SAME QUERY...

or even

B)

WITH rows AS (
    ...SAME SELECT QUERY...
)
INSERT INTO norepl_1542_result (axis_x1, axis_y1, v_1)
SELECT * FROM rows

the query time dramatically grows to 500+ seconds. You can see EXPLAIN ANALYZE VERBOSE here: https://explain.depesz.com/s/AEWj

As you can see, 100% of the time goes to the same SELECT query; there is no issue with the INSERT part.

I have played with this a lot and it reproduces every time. So my question is: why does wrapping a SELECT query in INSERT ... SELECT dramatically change the query plan and make it 500x slower?
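One detail worth knowing if this is PostgreSQL 12 or later: the plain CTE in variant B is inlined into the INSERT statement, so it does not isolate the SELECT at all, and the planner sees essentially the same combined statement as in variant A. Below is a minimal sketch (not from the thread, assuming PostgreSQL 12+) of forcing the CTE to be planned on its own, which should reproduce the standalone SELECT's plan:

-- Sketch, assuming PostgreSQL 12+: AS MATERIALIZED prevents the CTE
-- from being inlined into the INSERT, so the SELECT keeps its
-- standalone plan and the INSERT merely consumes its rows.
WITH rows AS MATERIALIZED (
    -- ...the same SELECT query as above...
    SELECT 1 AS axis_x1, 2 AS axis_y1, 3 AS v_1  -- placeholder body
)
INSERT INTO norepl_1542_result (axis_x1, axis_y1, v_1)
SELECT * FROM rows;

An alternative with the same isolating effect is CREATE TEMP TABLE ... AS with the SELECT, followed by a plain INSERT ... SELECT from the temp table.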
Re: Simple IN vs IN values performance
Thanks for the suggestion with temp tables. There is another issue: the queries already join 5-6 tables and carry 2-3 IN filters. If I replace all of those IN lists with temp tables, the query may hit the planner's join limits and start producing terrible plans, for example once the genetic query optimizer kicks in (see the sketch after the quoted message below).

On Tue, Feb 23, 2021 at 1:45 AM Ron wrote:

> On 2/22/21 9:00 AM, Oleksandr Voytsekhovskyy wrote:
> > Greetings,
> >
> > We have queries with IN filters with a long list of INT values.
> >
> > Sometimes they run extremely slow, and I have found a suggestion to use the syntax
> >
> > Field IN (VALUES(1465), (1478), ...
> >
> > instead of
> >
> > Field IN (1465, 1478, ...
> >
> > In some cases it helps, but in others it makes the query run 1000+ times slower.
>
> Insert the values into a TEMPORARY TABLE, then join that to your main table?
>
> > So the issue is:
> > - some queries are much faster with simple IN
> > - some queries are much faster with IN + VALUES
> >
> > Here is the explain for a case when it's extremely slow:
> >
> > ->  HashAggregate  (cost=5.78..9.62 rows=385 width=4)
> >       Group Key: "*VALUES*".column1
> >       ->  Values Scan on "*VALUES*"  (cost=0.00..4.81 rows=385 width=4)
> >
> > What is the right way to pass a long list of INT values to an IN filter?
> >
> > I am using PostgreSQL 13.1 on Ubuntu
>
> --
> Angular momentum makes the world go 'round.
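For reference, a sketch of the temp-table pattern under discussion, together with the planner settings that decide when the genetic query optimizer takes over. The names big_table and field are hypothetical placeholders, not from the thread:

-- Sketch of the temp-table variant of a long IN list;
-- "big_table" and "field" are hypothetical names.
CREATE TEMP TABLE filter_ids (id int PRIMARY KEY);
INSERT INTO filter_ids VALUES (1465), (1478);  -- ...rest of the list
ANALYZE filter_ids;                            -- give the planner real statistics

SELECT t.*
FROM big_table t
JOIN filter_ids f ON f.id = t.field;

-- GEQO replaces exhaustive join-order search once a query references
-- geqo_threshold or more FROM items (default 12); join_collapse_limit
-- and from_collapse_limit (default 8) bound how many items the
-- standard planner will reorder.
SHOW geqo_threshold;
SET join_collapse_limit = 12;  -- session-local experiment

Each temp table added this way counts as one more FROM item, which is why stacking several of them onto a 5-6 table join can push the query over these limits.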
Re: Simple IN vs IN values performance
Greetings,

I have tested both options - ANY with a string plus parsing, and a simple array - and there are cases when the execution time is 100+ times worse than with IN.

On Tue, Feb 23, 2021 at 12:23 AM David G. Johnston < david.g.johns...@gmail.com> wrote:

> On Monday, February 22, 2021, Oleksandr Voytsekhovskyy wrote:
>
>> What is the right way to pass a long list of INT values to an IN filter?
>
> Don't.
>
> Pass in a delimited string, then parse that string into an array and use "= any(array)".
>
> This has the primary benefit of making the input a single parameter.
>
> David J.
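For concreteness, a minimal sketch of the pattern David describes; big_table and field are hypothetical names, and the whole ID list travels as one text parameter:

-- Sketch of "= ANY(array)" with the ID list passed as a single
-- parameter; "big_table" and "field" are placeholder names.
PREPARE by_ids(text) AS
SELECT *
FROM big_table
WHERE field = ANY (string_to_array($1, ',')::int[]);

EXECUTE by_ids('1465,1478,1490');

One caveat that may explain the 100x regressions: when the list arrives as a parameter, the planner cannot see how many elements it holds at plan time, so its row estimates can differ sharply from those for a literal IN list.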