alamb commented on PR #13133:
URL: https://github.com/apache/datafusion/pull/13133#issuecomment-2447695611

   I also verified that some of the queries that got faster actually use
SortPreservingMerge, which they do:
   
   ```
   andrewlamb@Andrews-MacBook-Pro-2:~/Downloads$ datafusion-cli -f q38.sql
   DataFusion CLI v42.1.0
   
   +---------------+------------------------------------------------------------+
   | plan_type     | plan |
   +---------------+------------------------------------------------------------+
   | logical_plan  | Limit: skip=1000, fetch=10 |
   |               |   Sort: pageviews DESC NULLS FIRST, fetch=1010 |
   |               |     Projection: hits.URL, count(*) AS pageviews |
   |               |       Aggregate: groupBy=[[hits.URL]], aggr=[[count(Int64(1)) AS count(*)]] |
   |               |         Projection: hits.URL |
   |               |           Filter: hits.CounterID = Int32(62) AND CAST(CAST(hits.EventDate AS Int32) AS Date32) >= Date32("2013-07-01") AND CAST(CAST(hits.EventDate AS Int32) AS Date32) <= Date32("2013-07-31") AND hits.IsRefresh = Int16(0) AND hits.IsLink != Int16(0) AND hits.IsDownload = Int16(0) |
   |               |             TableScan: hits projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], partial_filters=[hits.CounterID = Int32(62), CAST(CAST(hits.EventDate AS Int32) AS Date32) >= Date32("2013-07-01"), CAST(CAST(hits.EventDate AS Int32) AS Date32) <= Date32("2013-07-31"), hits.IsRefresh = Int16(0), hits.IsLink != Int16(0), hits.IsDownload = Int16(0)] |
   | physical_plan | GlobalLimitExec: skip=1000, fetch=10 |
   |               |   SortPreservingMergeExec: [pageviews@1 DESC], fetch=1010 |
   |               |     SortExec: TopK(fetch=1010), expr=[pageviews@1 DESC], preserve_partitioning=[true] |
   |               |       ProjectionExec: expr=[URL@0 as URL, count(*)@1 as pageviews] |
   |               |         AggregateExec: mode=FinalPartitioned, gby=[URL@0 as URL], aggr=[count(*)] |
   |               |           CoalesceBatchesExec: target_batch_size=8192 |
   |               |             RepartitionExec: partitioning=Hash([URL@0], 16), input_partitions=16 |
   |               |               AggregateExec: mode=Partial, gby=[URL@0 as URL], aggr=[count(*)] |
   |               |                 CoalesceBatchesExec: target_batch_size=8192 |
   |               |                   FilterExec: CounterID@1 = 62 AND CAST(CAST(EventDate@0 AS Int32) AS Date32) >= 2013-07-01 AND CAST(CAST(EventDate@0 AS Int32) AS Date32) <= 2013-07-31 AND IsRefresh@3 = 0 AND IsLink@4 != 0 AND IsDownload@5 = 0, projection=[URL@2] |
   |               |                     ParquetExec: file_groups={16 groups: [[Users/andrewlamb/Downloads/hits/hits_0.parquet:0..122446530, Users/andrewlamb/Downloads/hits/hits_1.parquet:0..174965044, Users/andrewlamb/Downloads/hits/hits_10.parquet:0..101513258, Users/andrewlamb/Downloads/hits/hits_11.parquet:0..118419888, Users/andrewlamb/Downloads/hits/hits_12.parquet:0..149514164, ...], [Users/andrewlamb/Downloads/hits/hits_14.parquet:108113265..151121699, Users/andrewlamb/Downloads/hits/hits_15.parquet:0..103098894, Users/andrewlamb/Downloads/hits/hits_16.parquet:0..101067219, Users/andrewlamb/Downloads/hits/hits_17.parquet:0..116867853, Users/andrewlamb/Downloads/hits/hits_18.parquet:0..133119589, ...], [Users/andrewlamb/Downloads/hits/hits_21.parquet:3887560..113455196, Users/andrewlamb/Downloads/hits/hits_22.parquet:0..79775901, Users/andrewlamb/Downloads/hits/hits_23.parquet:0..79631107, Users/andrewlamb/Downloads/hits/hits_24.parquet:0..78257049, Users/andrewlamb/Downloads/hits/hits_25.parquet:0..144169728, ...], [Users/andrewlamb/Downloads/hits/hits_28.parquet:106905624..162772407, Users/andrewlamb/Downloads/hits/hits_29.parquet:0..79213288, Users/andrewlamb/Downloads/hits/hits_3.parquet:0..192507052, Users/andrewlamb/Downloads/hits/hits_30.parquet:0..124187913, Users/andrewlamb/Downloads/hits/hits_31.parquet:0..123065410, ...], [Users/andrewlamb/Downloads/hits/hits_35.parquet:54087340..153632381, Users/andrewlamb/Downloads/hits/hits_36.parquet:0..92487304, Users/andrewlamb/Downloads/hits/hits_37.parquet:0..108247781, Users/andrewlamb/Downloads/hits/hits_38.parquet:0..132005180, Users/andrewlamb/Downloads/hits/hits_39.parquet:0..103522954, ...], ...]}, projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], predicate=CounterID@6 = 62 AND CAST(CAST(EventDate@5 AS Int32) AS Date32) >= 2013-07-01 AND CAST(CAST(EventDate@5 AS Int32) AS Date32) <= 2013-07-31 AND IsRefresh@15 = 0 AND IsLink@52 != 0 AND IsDownload@53 = 0, pruning_predicate=CASE WHEN CounterID_null_count@2 = CounterID_row_count@3 THEN false ELSE CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 END AND CASE WHEN IsRefresh_null_count@6 = IsRefresh_row_count@7 THEN false ELSE IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 END AND CASE WHEN IsLink_null_count@10 = IsLink_row_count@11 THEN false ELSE IsLink_min@8 != 0 OR 0 != IsLink_max@9 END AND CASE WHEN IsDownload_null_count@14 = IsDownload_row_count@15 THEN false ELSE IsDownload_min@12 <= 0 AND 0 <= IsDownload_max@13 END, required_guarantees=[CounterID in (62), IsDownload in (0), IsLink not in (0), IsRefresh in (0)] |
   |               |  |
   
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   2 row(s) fetched.
   Elapsed 0.090 seconds.
   ```


