alamb opened a new issue, #15177:
URL: https://github.com/apache/datafusion/issues/15177

   ### Is your feature request related to a problem or challenge?
   
   Part of https://github.com/apache/datafusion/issues/14586
   
   [Comparing ClickBench on DataFusion 45 and DuckDB 
(link)](https://benchmark.clickhouse.com/#eyJzeXN0ZW0iOnsiQWxsb3lEQiI6ZmFsc2UsIkFsbG95REIgKHR1bmVkKSI6ZmFsc2UsIkF0aGVuYSAocGFydGl0aW9uZWQpIjpmYWxzZSwiQXRoZW5hIChzaW5nbGUpIjpmYWxzZSwiQXVyb3JhIGZvciBNeVNRTCI6ZmFsc2UsIkF1cm9yYSBmb3IgUG9zdGdyZVNRTCI6ZmFsc2UsIkJ5Q29uaXR5IjpmYWxzZSwiQnl0ZUhvdXNlIjpmYWxzZSwiY2hEQiAoRGF0YUZyYW1lKSI6ZmFsc2UsImNoREIgKFBhcnF1ZXQsIHBhcnRpdGlvbmVkKSI6ZmFsc2UsImNoREIiOmZhbHNlLCJDaXR1cyI6ZmFsc2UsIkNsaWNrSG91c2UgQ2xvdWQgKGF3cykiOmZhbHNlLCJDbGlja0hvdXNlIENsb3VkIChhenVyZSkiOmZhbHNlLCJDbGlja0hvdXNlIENsb3VkIChnY3ApIjpmYWxzZSwiQ2xpY2tIb3VzZSAoZGF0YSBsYWtlLCBwYXJ0aXRpb25lZCkiOmZhbHNlLCJDbGlja0hvdXNlIChkYXRhIGxha2UsIHNpbmdsZSkiOmZhbHNlLCJDbGlja0hvdXNlIChQYXJxdWV0LCBwYXJ0aXRpb25lZCkiOmZhbHNlLCJDbGlja0hvdXNlIChQYXJxdWV0LCBzaW5nbGUpIjpmYWxzZSwiQ2xpY2tIb3VzZSAod2ViKSI6ZmFsc2UsIkNsaWNrSG91c2UiOmZhbHNlLCJDbGlja0hvdXNlICh0dW5lZCkiOmZhbHNlLCJDbGlja0hvdXNlICh0dW5lZCwgbWVtb3J5KSI6ZmFsc2UsIkNsb3VkYmVycnkiOmZhbHNlLCJDcmF0ZURCIjpmYWx
 
zZSwiQ3J1bmNoeSBCcmlkZ2UgZm9yIEFuYWx5dGljcyAoUGFycXVldCkiOmZhbHNlLCJEYXRhYmVuZCI6ZmFsc2UsIkRhdGFGdXNpb24gKFBhcnF1ZXQsIHBhcnRpdGlvbmVkKSI6dHJ1ZSwiRGF0YUZ1c2lvbiAoUGFycXVldCwgc2luZ2xlKSI6ZmFsc2UsIkFwYWNoZSBEb3JpcyI6ZmFsc2UsIkRyaWxsIjpmYWxzZSwiRHJ1aWQiOmZhbHNlLCJEdWNrREIgKERhdGFGcmFtZSkiOmZhbHNlLCJEdWNrREIgKG1lbW9yeSkiOmZhbHNlLCJEdWNrREIgKFBhcnF1ZXQsIHBhcnRpdGlvbmVkKSI6dHJ1ZSwiRHVja0RCIjpmYWxzZSwiRWxhc3RpY3NlYXJjaCI6ZmFsc2UsIkVsYXN0aWNzZWFyY2ggKHR1bmVkKSI6ZmFsc2UsIkdsYXJlREIiOmZhbHNlLCJHcmVlbnBsdW0iOmZhbHNlLCJIZWF2eUFJIjpmYWxzZSwiSHlkcmEiOmZhbHNlLCJTYWxlc2ZvcmNlIEh5cGVyIChQYXJxdWV0KSI6ZmFsc2UsIlNhbGVzZm9yY2UgSHlwZXIiOmZhbHNlLCJJbmZvYnJpZ2h0IjpmYWxzZSwiS2luZXRpY2EiOmZhbHNlLCJNYXJpYURCIENvbHVtblN0b3JlIjpmYWxzZSwiTWFyaWFEQiI6ZmFsc2UsIk1vbmV0REIiOmZhbHNlLCJNb25nb0RCIjpmYWxzZSwiTW90aGVyRHVjayI6ZmFsc2UsIk15U1FMIChNeUlTQU0pIjpmYWxzZSwiTXlTUUwiOmZhbHNlLCJPY3RvU1FMIjpmYWxzZSwiT3B0ZXJ5eCI6ZmFsc2UsIk94bGEiOmZhbHNlLCJQYW5kYXMgKERhdGFGcmFtZSkiOmZhbHNlLCJQYXJhZGVEQiAoUGFycXVldCwgcGFydGl0aW9uZWQpIjpm
 
YWxzZSwiUGFyYWRlREIgKFBhcnF1ZXQsIHNpbmdsZSkiOmZhbHNlLCJwZ19kdWNrZGIgKHdpdGggaW5kZXhlcykiOmZhbHNlLCJwZ19kdWNrZGIgKE1vdGhlckR1Y2sgZW5hYmxlZCkiOmZhbHNlLCJwZ19kdWNrZGIiOmZhbHNlLCJwZ19kdWNrZGIgKFBhcnF1ZXQpIjpmYWxzZSwiUG9zdGdyZVNRTCB3aXRoIHBnX21vb25jYWtlIjpmYWxzZSwiUGlub3QiOmZhbHNlLCJQb2xhcnMgKERhdGFGcmFtZSkiOmZhbHNlLCJQb2xhcnMgKFBhcnF1ZXQpIjpmYWxzZSwiUG9zdGdyZVNRTCAod2l0aCBpbmRleGVzKSI6ZmFsc2UsIlBvc3RncmVTUUwiOmZhbHNlLCJRdWVzdERCIjpmYWxzZSwiUmVkc2hpZnQiOmZhbHNlLCJTZWxlY3REQiI6ZmFsc2UsIlNpbmdsZVN0b3JlIjpmYWxzZSwiU25vd2ZsYWtlIjpmYWxzZSwiU3BhcmsiOmZhbHNlLCJTUUxpdGUiOmZhbHNlLCJTdGFyUm9ja3MiOmZhbHNlLCJUYWJsZXNwYWNlIjpmYWxzZSwiVGVtYm8gT0xBUCAoY29sdW1uYXIpIjpmYWxzZSwiVGltZXNjYWxlIENsb3VkIjpmYWxzZSwiVGltZXNjYWxlREIgKG5vIGNvbHVtbnN0b3JlKSI6ZmFsc2UsIlRpbWVzY2FsZURCIjpmYWxzZSwiVGlueWJpcmQgKEZyZWUgVHJpYWwpIjpmYWxzZSwiVW1icmEiOmZhbHNlLCJVcnNhIjpmYWxzZSwiVmljdG9yaWFMb2dzIjpmYWxzZX0sInR5cGUiOnsiQyI6dHJ1ZSwiY29sdW1uLW9yaWVudGVkIjp0cnVlLCJQb3N0Z3JlU1FMIGNvbXBhdGlibGUiOnRydWUsIm1hbmFnZWQiOnRydWUsImdjcCI6d
 
HJ1ZSwic3RhdGVsZXNzIjp0cnVlLCJKYXZhIjp0cnVlLCJDKysiOnRydWUsIk15U1FMIGNvbXBhdGlibGUiOnRydWUsInJvdy1vcmllbnRlZCI6dHJ1ZSwiQ2xpY2tIb3VzZSBkZXJpdmF0aXZlIjp0cnVlLCJlbWJlZGRlZCI6dHJ1ZSwic2VydmVybGVzcyI6dHJ1ZSwiZGF0YWZyYW1lIjp0cnVlLCJhd3MiOnRydWUsImF6dXJlIjp0cnVlLCJhbmFseXRpY2FsIjp0cnVlLCJSdXN0Ijp0cnVlLCJzZWFyY2giOnRydWUsImRvY3VtZW50Ijp0cnVlLCJHbyI6dHJ1ZSwic29tZXdoYXQgUG9zdGdyZVNRTCBjb21wYXRpYmxlIjp0cnVlLCJEYXRhRnJhbWUiOnRydWUsInBhcnF1ZXQiOnRydWUsInRpbWUtc2VyaWVzIjp0cnVlfSwibWFjaGluZSI6eyIxNiB2Q1BVIDEyOEdCIjpmYWxzZSwiOCB2Q1BVIDY0R0IiOmZhbHNlLCJzZXJ2ZXJsZXNzIjpmYWxzZSwiMTZhY3UiOmZhbHNlLCJjNmEuNHhsYXJnZSwgNTAwZ2IgZ3AyIjp0cnVlLCJMIjpmYWxzZSwiTSI6ZmFsc2UsIlMiOmZhbHNlLCJYUyI6ZmFsc2UsImM2YS5tZXRhbCwgNTAwZ2IgZ3AyIjpmYWxzZSwiMTJHaUIsIDEgcmVwbGljYShzKSI6ZmFsc2UsIjhHaUIsIDEgcmVwbGljYShzKSI6ZmFsc2UsIjEyR2lCLCAyIHJlcGxpY2EocykiOmZhbHNlLCIxMjBHaUIsIDIgcmVwbGljYShzKSI6ZmFsc2UsIjE2R2lCLCAyIHJlcGxpY2EocykiOmZhbHNlLCIyMzZHaUIsIDIgcmVwbGljYShzKSI6ZmFsc2UsIjMyR2lCLCAyIHJlcGxpY2EocykiOmZhbHNlLCI2NEdpQiwgMiByZX
 
BsaWNhKHMpIjpmYWxzZSwiOEdpQiwgMiByZXBsaWNhKHMpIjpmYWxzZSwiMTJHaUIsIDMgcmVwbGljYShzKSI6ZmFsc2UsIjEyMEdpQiwgMyByZXBsaWNhKHMpIjpmYWxzZSwiMTZHaUIsIDMgcmVwbGljYShzKSI6ZmFsc2UsIjIzNkdpQiwgMyByZXBsaWNhKHMpIjpmYWxzZSwiMzJHaUIsIDMgcmVwbGljYShzKSI6ZmFsc2UsIjY0R2lCLCAzIHJlcGxpY2EocykiOmZhbHNlLCI4R2lCLCAzIHJlcGxpY2EocykiOmZhbHNlLCJjNW4uNHhsYXJnZSwgNTAwZ2IgZ3AyIjpmYWxzZSwiQW5hbHl0aWNzLTI1NkdCICg2NCB2Q29yZXMsIDI1NiBHQikiOmZhbHNlLCJjNS40eGxhcmdlLCA1MDBnYiBncDIiOmZhbHNlLCJjNmEuNHhsYXJnZSwgMTUwMGdiIGdwMiI6ZmFsc2UsIlhMIjpmYWxzZSwiSnVtYm8iOmZhbHNlLCJQdWxzZSI6ZmFsc2UsIlN0YW5kYXJkIjpmYWxzZSwiZGMyLjh4bGFyZ2UiOmZhbHNlLCJyYTMuMTZ4bGFyZ2UiOmZhbHNlLCJyYTMuNHhsYXJnZSI6ZmFsc2UsInJhMy54bHBsdXMiOmZhbHNlLCJTMiI6ZmFsc2UsIlMyNCI6ZmFsc2UsIjJYTCI6ZmFsc2UsIjNYTCI6ZmFsc2UsIjRYTCI6ZmFsc2UsIkwxIC0gMTZDUFUgMzJHQiI6ZmFsc2UsImM2YS40eGxhcmdlLCA1MDBnYiBncDMiOmZhbHNlLCIxNiB2Q1BVIDY0R0IiOmZhbHNlLCI0IHZDUFUgMTZHQiI6ZmFsc2UsIjggdkNQVSAzMkdCIjpmYWxzZX0sImNsdXN0ZXJfc2l6ZSI6eyIxIjp0cnVlLCIyIjp0cnVlLCIzIjp0cnVlLCI0Ijp0cnVlLCI4Ijp0cnV
 
lLCIxNiI6dHJ1ZSwiMzIiOnRydWUsIjY0Ijp0cnVlLCIxMjgiOnRydWUsInNlcnZlcmxlc3MiOnRydWUsInVuZGVmaW5lZCI6dHJ1ZX0sIm1ldHJpYyI6ImhvdCIsInF1ZXJpZXMiOlt0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlLHRydWUsdHJ1ZSx0cnVlXX0=)
   
   You can see that for query 23 DataFusion is almost 2x slower (around 10s where DuckDB takes 5s)
   
![Image](https://github.com/user-attachments/assets/fe6ed804-5058-45ad-9d9b-b14189a1bd65)
   
   
   You can run this query like this:
   ```shell
   cd datafusion
   cd benchmarks
   # download data
   ./bench.sh data clickbench_partitioned
   # run query with datafusion-cli (note the escaped double quotes)
   datafusion-cli -c "SELECT * FROM 'data/hits_partitioned' WHERE \"URL\" LIKE '%google%' ORDER BY \"EventTime\" LIMIT 10;"
   ```
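
As a possible alternative to escaping the double quotes, the query can be written to a file with a quoted heredoc and passed to `datafusion-cli` with `-f`. This is only a sketch: the file name `q23.sql` is hypothetical, and it assumes you are in `datafusion/benchmarks` with the data already downloaded.

```shell
# Sketch: store the query in a file so "URL" and "EventTime" need no backslash escapes.
# q23.sql is a hypothetical file name chosen for this example.
cat > q23.sql <<'EOF'
SELECT * FROM 'data/hits_partitioned' WHERE "URL" LIKE '%google%' ORDER BY "EventTime" LIMIT 10;
EOF

# Run the file only if datafusion-cli is on PATH and the data directory exists.
if command -v datafusion-cli >/dev/null 2>&1 && [ -d data/hits_partitioned ]; then
  datafusion-cli -f q23.sql
fi
```

The quoted `'EOF'` delimiter keeps the shell from interpreting anything inside the heredoc, so the SQL is stored exactly as typed.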
   
   Here is the explain plan:
   
   ```
   andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion/benchmarks$ 
datafusion-cli -c "EXPLAIN SELECT * FROM 'data/hits_partitioned' WHERE \"URL\" 
LIKE '%google%' ORDER BY \"EventTime\" LIMIT 10;"
   DataFusion CLI v46.0.0
   
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-----------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                       
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                            |
   
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-----------------------------------------------------------------------------------------------------------+
   | logical_plan  | Sort: data/hits_partitioned.EventTime ASC NULLS LAST, 
fetch=10                                                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                            |
   |               |   Filter: CAST(data/hits_partitioned.URL AS Utf8View) LIKE 
Utf8View("%google%")                                                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                            |
   |               |     TableScan: data/hits_partitioned projection=[WatchID, 
JavaEnable, Title, GoodEvent, EventTime, EventDate, CounterID, ClientIP, 
RegionID, UserID, CounterClass, OS, UserAgent, URL, Referer, IsRefresh, 
RefererCategoryID, RefererRegionID, URLCategoryID, URLRegionID, 
ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, FlashMinor, 
FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, CookieEnable, 
JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, Params, IPNetworkID, 
TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, 
WindowClientWidth, WindowClientHeight, ClientTimeZone, ClientEventTime, 
SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, 
SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, 
FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, 
WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, 
RemoteIP, WindowName, OpenerName, 
 HistoryLength, BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, 
HTTPError, SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, 
ResponseEndTiming, FetchTiming, SocialSourceNetworkID, SocialSourcePage, 
ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, OpenstatServiceName, 
OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, 
UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, 
CLID], partial_filters=[CAST(data/hits_partitioned.URL AS Utf8View) LIKE 
Utf8View("%google%")]                                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                            |
   | physical_plan | SortPreservingMergeExec: [EventTime@4 ASC NULLS LAST], fetch=10 |
   |               |   SortExec: TopK(fetch=10), expr=[EventTime@4 ASC NULLS LAST], preserve_partitioning=[true] |
   |               |     CoalesceBatchesExec: target_batch_size=8192 |
   |               |       FilterExec: CAST(URL@13 AS Utf8View) LIKE %google% |
   |               |         DataSourceExec: file_groups={16 groups: 
[[Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_0.parquet:0..122446530,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_1.parquet:0..174965044,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_10.parquet:0..101513258,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_11.parquet:0..118419888,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_12.parquet:0..149514164,
 ...], 
[Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_14.parquet:108113265..151121699,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_15.parquet:0..103098894,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_16.parquet:0..101067219,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_17.parquet:0..116867853,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_18.parquet:0..133119589,
 ...], 
[Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_21.parquet:3887560..113455196,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_22.parquet:0..79775901,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_23.parquet:0..79631107,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_24.parquet:0..78257049,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_25.parquet:0..144169728,
 ...], 
[Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_28.parquet:106905624..162772407,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_29.parquet:0..79213288,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_3.parquet:0..192507052,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_30.parquet:0..124187913,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_31.parquet:0..123065410,
 ...], 
[Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_35.parquet:54087340..153632381,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_36.parquet:0..92487304,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_37.parquet:0..108247781,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_38.parquet:0..132005180,
 Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned/hits_39.parquet:0..103522954,
 ...], ...]}, projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, 
EventDate, CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, 
URL, Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, 
URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, 
FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, 
CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, 
Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, 
IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, 
ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, 
SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, 
FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, 
WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, 
RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, 
SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, 
ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, 
SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, 
OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, 
UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, 
RefererHash, URLHash, CLID], file_type=parquet, 
predicate=CAST(URL@13 AS Utf8View) LIKE %google% |
   |               |                                                            |
   +---------------+------------------------------------------------------------+
   2 row(s) fetched.
   Elapsed 0.056 seconds.
   ```
   
   Something that immediately jumps out at me in the explain plan is this line:
   
   ```
   |               |         DataSourceExec: file_groups={16 groups: ...}, 
projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate, 
CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL, 
Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, 
URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, 
FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, 
CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, 
Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, 
IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, 
ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, 
SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, 
FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, 
WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, 
RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, 
SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, 
ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, 
SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, 
OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, 
UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, 
RefererHash, URLHash, CLID], file_type=parquet, predicate=CAST(URL@13 AS 
Utf8View) LIKE %google% |
   ```
   
   I think "projection" means that all of those columns are being read and 
decoded from Parquet, which makes sense because the query is a `SELECT *`.
   
   However, in this case only the top 10 rows are returned (out of 100M rows 
in the file).
   
   So this means that most of the data is decoded and then thrown away 
immediately.
   
   
   
   
   
   ### Describe the solution you'd like
   
   I would like to close the gap with DuckDB with some general-purpose 
improvement.
   
   ### Describe alternatives you've considered
   
   I think the way to improve performance here is to defer decoding 
("Materializing") the other columns until we know what the top 10 rows are.
   
   Some wacky ideas:
   1. Push the TopK / ordering into the scan somehow
   2. Implement "late materialization"
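
   To make the first idea concrete, here is a toy model in Python. It is only a 
sketch under stated assumptions, not how DataFusion implements (or would 
implement) this: `RowGroup`, `min_event_time`, and `topk_with_scan_pruning` are 
hypothetical names, and the per-row-group minimum stands in for the kind of 
min/max statistics a Parquet file can carry. Once the scan holds `k` candidate 
rows, any row group whose minimum `EventTime` is not below the current k-th 
value cannot contribute and can be skipped without decoding.

```python
import heapq
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group plus its min statistic."""
    min_event_time: int
    rows: list  # list of (event_time, url) tuples


def topk_with_scan_pruning(row_groups, k, predicate):
    """Return the k smallest event_time values among rows whose url passes
    `predicate`, together with the number of row groups actually decoded."""
    heap = []  # max-heap of the best k values so far (stored negated)
    decoded = 0
    for rg in row_groups:
        # With k candidates in hand, a row group whose minimum is not
        # smaller than the current k-th value cannot improve the result.
        if len(heap) == k and rg.min_event_time >= -heap[0]:
            continue
        decoded += 1
        for event_time, url in rg.rows:
            if not predicate(url):
                continue
            if len(heap) < k:
                heapq.heappush(heap, -event_time)
            elif event_time < -heap[0]:
                heapq.heapreplace(heap, -event_time)
    return sorted(-v for v in heap), decoded
```

   In this model the threshold only tightens as the scan proceeds, so how much 
gets skipped depends on the order in which row groups are visited and on how 
selective the filter is.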
   
   Late materialization would look something like
   1. decode only the EventTime column and a `row_id`
   2. determine the top 10 row_id by sorting by EventTime
   3. Decode only those 10 rows from the parquet file(s)
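
   The three steps above can be sketched with a toy columnar table in Python. 
This is an illustration under assumptions, not DataFusion or Parquet reader 
code: `CountingTable`, `decode_column`, `decode_rows`, `eager_topk`, and 
`late_materialized_topk` are hypothetical names, the list index plays the role 
of `row_id`, and the `URL LIKE '%google%'` filter from the real query is left 
out to keep the sketch short. The counter tracks how many values each strategy 
decodes.

```python
import heapq


class CountingTable:
    """Toy columnar 'file': column name -> list of values.
    Counts how many individual values have been decoded."""

    def __init__(self, columns):
        self.columns = columns
        self.num_rows = len(next(iter(columns.values())))
        self.values_decoded = 0

    def decode_column(self, name):
        # Decoding a whole column touches every row.
        self.values_decoded += self.num_rows
        return list(self.columns[name])

    def decode_rows(self, row_ids):
        # Decoding selected rows touches every column, but only those rows.
        self.values_decoded += len(row_ids) * len(self.columns)
        return [{n: c[i] for n, c in self.columns.items()} for i in row_ids]


def eager_topk(table, sort_col, k):
    """Decode every column first, then keep the k smallest rows."""
    cols = {name: table.decode_column(name) for name in table.columns}
    row_ids = heapq.nsmallest(
        k, range(table.num_rows), key=lambda i: cols[sort_col][i]
    )
    return [{n: c[i] for n, c in cols.items()} for i in row_ids]


def late_materialized_topk(table, sort_col, k):
    """Follow the three steps: sort column, top-k row ids, then the rest."""
    # Step 1: decode only the sort column (index == row_id).
    sort_values = table.decode_column(sort_col)
    # Step 2: find the row ids of the k smallest sort values.
    row_ids = heapq.nsmallest(
        k, range(table.num_rows), key=sort_values.__getitem__
    )
    # Step 3: decode all columns, but only for those k rows.
    return table.decode_rows(row_ids)
```

   With N rows and C columns, the eager version decodes N * C values while the 
late version decodes N + k * C, which is where the potential saving for a 
`SELECT * ... LIMIT 10` over a wide table would come from.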
   
   
   
   ### Additional context
   
   _No response_

