True… but in that case the CTE needs to be more expressive, e.g.:

with d as (
  select date
  from inventory
  order by date
  limit 10
), df as (
  select distinct date
  from d
)
select i.date, a.name, i.quantity
from inventory i
join asset a on a.id = i.id_asset
where i.date in (select date from df)
order by i.date, a.name
limit 10
;  


prod=>  with d as (
prod(>   select date
prod(>   from inventory
prod(>   order by date
prod(>   limit 10
prod(> ), df as (
prod(>   select distinct date
prod(>   from d
prod(> )
prod-> select i.date, a.name, i.quantity
prod-> from inventory i
prod-> join asset a on a.id = i.id_asset
prod-> where i.date in (select date from df)
prod-> order by i.date, a.name
prod-> limit 10
prod-> ; 
    date    |           name            | quantity 
------------+---------------------------+----------
 2004-09-22 | Thing 0.00122669106349349 |        0
 2004-09-22 | Thing 0.00140673760324717 |        0
 2004-09-22 | Thing 0.00180063676089048 |        0
 2004-09-22 | Thing 0.00463481899350882 |        1
 2004-09-22 | Thing 0.00622459733858705 |        1
 2004-09-22 | Thing 0.00649207830429077 |        0
 2004-09-22 | Thing 0.00823836214840412 |        1
 2004-09-22 | Thing 0.0109024560078979  |        1
 2004-09-22 | Thing 0.0109436474740505  |        0
 2004-09-22 | Thing 0.0111544523388147  |        0
(10 rows)

Time: 3.040 ms
prod=> select inventory.date, asset.name, inventory.quantity
prod-> from asset 
prod-> join inventory on id_asset = asset.id
prod-> order by inventory.date, asset.name
prod-> limit 10;
    date    |           name            | quantity 
------------+---------------------------+----------
 2004-09-22 | Thing 0.00122669106349349 |        0
 2004-09-22 | Thing 0.00140673760324717 |        0
 2004-09-22 | Thing 0.00180063676089048 |        0
 2004-09-22 | Thing 0.00463481899350882 |        1
 2004-09-22 | Thing 0.00622459733858705 |        1
 2004-09-22 | Thing 0.00649207830429077 |        0
 2004-09-22 | Thing 0.00823836214840412 |        1
 2004-09-22 | Thing 0.0109024560078979  |        1
 2004-09-22 | Thing 0.0109436474740505  |        0
 2004-09-22 | Thing 0.0111544523388147  |        0
(10 rows)

Time: 6733.775 ms (00:06.734)
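
FWIW, the same filter can also be written with a scalar date cutoff instead of IN. Untested sketch; it should be equivalent because every date earlier than the latest date in d must already appear in d:

with d as (
  select date
  from inventory
  order by date
  limit 10
)
select i.date, a.name, i.quantity
from inventory i
join asset a on a.id = i.id_asset
where i.date <= (select max(date) from d)  -- all rows up to the latest of those first 10 dates
order by i.date, a.name
limit 10
;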


> On Jun 1, 2018, at 12:46 PM, Chris Wilson <ch...@simply-italian.co.uk> wrote:
> 
> Hi Rui,
> 
> Unfortunately sorting and limiting the CTE doesn't work properly, because 
> exactly which 100 rows are selected depends on values in the asset table, 
> which are not known at the time that the CTE is evaluated.
> 
> I can work around it for our case by querying for the unique dates that make 
> it through the limit, and making the CTE return all and only the inventory 
> records matching those dates, but of course having this done automatically is 
> infinitely preferable. 
> 
> I'm really happy that this patch is actively being worked on and pushed 
> forward towards merging, and grateful to all involved in doing that. Thank 
> you for making Postgres even more awesome!
> 
> Thanks, Chris. 
> 
> Sent from my iPhone
> 
> On 1 Jun 2018, at 16:44, Rui DeSousa <rui.deso...@icloud.com> wrote:
> 
>> In the meantime you can force it with CTE.
>> 
>> with inv as (
>>   select id_asset
>>     , inventory.date
>>     , quantity
>>   from inventory
>>   order by inventory.date
>>   limit 100
>> )
>> select inv.date, asset.name, inv.quantity
>> from inv
>> join asset on id_asset = asset.id
>> order by inv.date, asset.name
>> ;
>> 
>>> On Jun 1, 2018, at 11:12 AM, Steven Winfield 
>>> <steven.winfi...@cantabcapital.com> wrote:
>>> 
>>> It does indeed!
>>>                                                                    QUERY PLAN
>>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> Limit  (cost=398.50..398.50 rows=100 width=32) (actual time=10.359..10.378 rows=100 loops=1)
>>>    Output: inventory.date, asset.name, inventory.quantity
>>>    ->  Incremental Sort  (cost=398.50..403.27 rows=5006001 width=32) (actual time=10.357..10.365 rows=100 loops=1)
>>>          Output: inventory.date, asset.name, inventory.quantity
>>>          Sort Key: inventory.date, asset.name
>>>          Presorted Key: inventory.date
>>>          Sort Method: quicksort  Memory: 103kB
>>>          Sort Groups: 1
>>>          ->  Nested Loop Left Join  (cost=0.71..1702372.39 rows=5006001 width=32) (actual time=0.030..2.523 rows=1002 loops=1)
>>>                Output: inventory.date, asset.name, inventory.quantity
>>>                Inner Unique: true
>>>                ->  Index Scan using inventory_pkey on temp.inventory  (cost=0.43..238152.40 rows=5006001 width=12) (actual time=0.016..0.290 rows=1002 loops=1)
>>>                      Output: inventory.date, inventory.id_asset, inventory.quantity
>>>                ->  Index Scan using asset_pkey on temp.asset  (cost=0.28..0.29 rows=1 width=28) (actual time=0.002..0.002 rows=1 loops=1002)
>>>                      Output: asset.id, asset.name
>>>                      Index Cond: (asset.id = inventory.id_asset)
>>>  
>>> I’m guessing the feature-freeze for v11 means we won’t see this in that 
>>> version, though, and the extra GUC it requires means it will be in v12 at 
>>> the earliest?
>>>  
>>> From: James Coleman [mailto:jtc...@gmail.com] 
>>> Sent: 01 June 2018 13:50
>>> To: Christopher Wilson
>>> Cc: pgsql-hackers@lists.postgresql.org; Steven Winfield
>>> Subject: Re: FW: Possible optimisation: push down SORT and LIMIT nodes
>>>  
>>> The incremental sort patch seems to significantly improve performance for 
>>> your query: https://commitfest.postgresql.org/17/1124/
>>>  
>>> On Fri, Jun 1, 2018 at 7:46 AM, Christopher Wilson 
>>> <chris.wil...@cantabcapital.com> wrote:
>>> Dear Postgres developers,
>>>  
>>> I sent this query to the performance list a couple of days ago, but nobody 
>>> has come up with any suggestions. I was wondering if you’d like to consider 
>>> it?
>>>  
>>> If this is interesting but nobody has time to implement it, then I would 
>>> potentially be willing to implement and submit it myself, in my own time. I 
>>> am experienced with C and C++, but I have not modified Postgres before, and 
>>> I would need significant support (e.g. on IRC) to help me to find my way 
>>> around the codebase and finish the task in an acceptable amount of time.
>>>  
>>> Thanks, Chris.
>>>  
>>>  
>>> 
>>> From: Christopher Wilson 
>>> Sent: 30 May 2018 16:47
>>> To: 'pgsql-performa...@postgresql.org'
>>> Cc: Steven Winfield (steven.winfi...@cantabcapital.com)
>>> Subject: Possible optimisation: push down SORT and LIMIT nodes
>>>  
>>> Hi all,
>>>  
>>> We have a query which is rather slow (about 10 seconds), and it looks like 
>>> this:
>>>  
>>> select inventory.date, asset.name, inventory.quantity
>>> from temp.inventory 
>>> left outer join temp.asset on asset.id = id_asset
>>> order by inventory.date, asset.name
>>> limit 100
>>>  
>>> The inventory table has the quantity of each asset in the inventory on each 
>>> date (complete SQL to create and populate the tables with dummy data is 
>>> below). The query plan looks like this (the non-parallel version is 
>>> similar):
>>>  
>>> <image001.png>
>>>  
>>> Or in text form:
>>>  
>>> Limit  (cost=217591.77..217603.60 rows=100 width=32) (actual time=9122.235..9122.535 rows=100 loops=1)
>>>    Buffers: shared hit=6645, temp read=6363 written=6364
>>>    ->  Gather Merge  (cost=217591.77..790859.62 rows=4844517 width=32) (actual time=9122.232..9122.518 rows=100 loops=1)
>>>          Workers Planned: 3
>>>          Workers Launched: 3
>>>          Buffers: shared hit=6645, temp read=6363 written=6364
>>>          ->  Sort  (cost=216591.73..220628.83 rows=1614839 width=32) (actual time=8879.909..8880.030 rows=727 loops=4)
>>>                Sort Key: inventory.date, asset.name
>>>                Sort Method: external merge  Disk: 50904kB
>>>                Buffers: shared hit=27365, temp read=25943 written=25947
>>>                ->  Hash Join  (cost=26.52..50077.94 rows=1614839 width=32) (actual time=0.788..722.095 rows=1251500 loops=4)
>>>                      Hash Cond: (inventory.id_asset = asset.id)
>>>                      Buffers: shared hit=27236
>>>                      ->  Parallel Seq Scan on inventory  (cost=0.00..29678.39 rows=1614839 width=12) (actual time=0.025..237.977 rows=1251500 loops=4)
>>>                            Buffers: shared hit=27060
>>>                      ->  Hash  (cost=14.01..14.01 rows=1001 width=28) (actual time=0.600..0.600 rows=1001 loops=4)
>>>                            Buckets: 1024  Batches: 1  Memory Usage: 68kB
>>>                            Buffers: shared hit=32
>>>                            ->  Seq Scan on asset  (cost=0.00..14.01 rows=1001 width=28) (actual time=0.026..0.279 rows=1001 loops=4)
>>>                                  Buffers: shared hit=32
>>> Planning time: 0.276 ms
>>> Execution time: 9180.144 ms
>>>  
>>> I can see why it does this, but I can also imagine a potential 
>>> optimisation, which would enable it to select far fewer rows from the 
>>> inventory table.
>>>  
>>> As we are joining to the primary key of the asset table, we know that this 
>>> join will not add extra rows to the output. Every output row comes from a 
>>> distinct inventory row. Therefore only 100 rows of the inventory table are 
>>> required. But which ones?
>>>  
>>> If we selected exactly 100 rows from inventory, ordered by date, then all 
>>> of the dates that were complete (every row for that date returned) would be 
>>> in the output. However, if there is a date which is incomplete (we haven’t 
>>> emitted all the inventory records for that date), then it’s possible that 
>>> we would need some records that we haven’t emitted yet. This can only be 
>>> known after joining to the asset table and sorting this last group by both 
>>> date and asset name. 
>>>  
>>> But we do know that there can only be 0 or 1 incomplete groups: either the 
>>> last group (by date) is incomplete, if the LIMIT cut it off in mid-group, 
>>> or its end coincided with the LIMIT boundary and it is complete. As long as 
>>> we continue selecting rows from this table until we exhaust the prefix of 
>>> the overall SORT which applies to it (in this case, just the date) then it 
>>> will be complete, and we will have all the inventory rows that can appear 
>>> in the output (because no possible values of columns that appear later in 
>>> the sort order can cause any rows with different dates to appear in the 
>>> output).
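>>>  
>>> For illustration only (not the proposed plan node): that behaviour can be 
>>> emulated by hand with a date cutoff taken from the 100th inventory row in 
>>> date order. Untested sketch, and it assumes inventory has at least 100 rows:
>>>  
>>> with cutoff as (
>>>   select date
>>>   from temp.inventory
>>>   order by date
>>>   offset 99 limit 1   -- date of the 100th inventory row in date order
>>> )
>>> select inventory.date, asset.name, inventory.quantity
>>> from temp.inventory
>>> left outer join temp.asset on asset.id = id_asset
>>> where inventory.date <= (select date from cutoff)  -- keeps the last, possibly partial, date group complete
>>> order by inventory.date, asset.name
>>> limit 100;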
>>>  
>>> I’m imagining something like a sort-limit-finish node, which sorts its 
>>> input and then returns at least the limit number of rows, but keeps 
>>> returning rows until it exhausts the last sort prefix that it read.
>>>  
>>> This node could be created from an ordinary SORT and LIMIT pair:
>>>  
>>> SORT + LIMIT -> SORT-LIMIT-FINISH + SORT + LIMIT
>>>  
>>> And then pushed down through either a left join, or an inner join on a 
>>> foreign key, when the right side is unique, and no columns from the right 
>>> side appear in WHERE conditions, nor anywhere in the sort order except at 
>>> the end. This sort column suffix would be removed by pushing the node down. 
>>> The resulting query plan would then look something like:
>>>  
>>> Index Scan on inventory
>>> SORT-LIMIT-FINISH(sort=[inventory.date], limit=100) (pushed down through the join to asset)
>>> Seq Scan on asset
>>> Hash Left Join (inventory.id_asset = asset.id)
>>> Sort (inventory.date, asset.name)
>>> Limit (100)
>>>  
>>> And would emit only about 100-1000 inventory rows from the index scan.
>>>  
>>> Does this sound correct, reasonable and potentially interesting to Postgres 
>>> developers?
>>>  
>>> SQL to reproduce:
>>>  
>>> create schema temp;
>>> create table temp.asset (
>>>         id serial primary key,
>>>         name text
>>> );
>>> insert into temp.asset (name) select 'Thing ' || random()::text as name from generate_series(0, 1000) as s;
>>> create table temp.inventory (
>>>         date date,
>>>         id_asset int,
>>>         quantity int,
>>>         primary key (date, id_asset),
>>>         CONSTRAINT id_asset_fk FOREIGN KEY (id_asset) REFERENCES temp.asset (id) MATCH SIMPLE
>>> );
>>> insert into temp.inventory (date, id_asset, quantity)
>>> select current_date - days, asset.id, random() from temp.asset, generate_series(0, 5000) as days;
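>>>  
>>> For reference, the plan above can be reproduced with something like an 
>>> EXPLAIN (ANALYZE, BUFFERS) of the query:
>>>  
>>> explain (analyze, buffers)
>>> select inventory.date, asset.name, inventory.quantity
>>> from temp.inventory
>>> left outer join temp.asset on asset.id = id_asset
>>> order by inventory.date, asset.name
>>> limit 100;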
>>>  
>>> Thanks, Chris.
>>>  
>> 
