alamb commented on PR #13133: URL: https://github.com/apache/datafusion/pull/13133#issuecomment-2447695611
I also verified that some of the queries that got faster actually use
SortPreservingMerge, and they do:
```
andrewlamb@Andrews-MacBook-Pro-2:~/Downloads$ datafusion-cli -f q38.sql
DataFusion CLI v42.1.0
+---------------+--------------------------------------------------------------
| plan_type     | plan
+---------------+--------------------------------------------------------------
| logical_plan  | Limit: skip=1000, fetch=10
|               |   Sort: pageviews DESC NULLS FIRST, fetch=1010
|               |     Projection: hits.URL, count(*) AS pageviews
|               |       Aggregate: groupBy=[[hits.URL]], aggr=[[count(Int64(1)) AS count(*)]]
|               |         Projection: hits.URL
|               |           Filter: hits.CounterID = Int32(62) AND CAST(CAST(hits.EventDate AS Int32) AS Date32) >= Date32("2013-07-01") AND CAST(CAST(hits.EventDate AS Int32) AS Date32) <= Date32("2013-07-31") AND hits.IsRefresh = Int16(0) AND hits.IsLink != Int16(0) AND hits.IsDownload = Int16(0)
|               |             TableScan: hits projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], partial_filters=[hits.CounterID = Int32(62), CAST(CAST(hits.EventDate AS Int32) AS Date32) >= Date32("2013-07-01"), CAST(CAST(hits.EventDate AS Int32) AS Date32) <= Date32("2013-07-31"), hits.IsRefresh = Int16(0), hits.IsLink != Int16(0), hits.IsDownload = Int16(0)]
| physical_plan | GlobalLimitExec: skip=1000, fetch=10
|               |   SortPreservingMergeExec: [pageviews@1 DESC], fetch=1010
|               |     SortExec: TopK(fetch=1010), expr=[pageviews@1 DESC], preserve_partitioning=[true]
|               |       ProjectionExec: expr=[URL@0 as URL, count(*)@1 as pageviews]
|               |         AggregateExec: mode=FinalPartitioned, gby=[URL@0 as URL], aggr=[count(*)]
|               |           CoalesceBatchesExec: target_batch_size=8192
|               |             RepartitionExec: partitioning=Hash([URL@0], 16), input_partitions=16
|               |               AggregateExec: mode=Partial, gby=[URL@0 as URL], aggr=[count(*)]
|               |                 CoalesceBatchesExec: target_batch_size=8192
|               |                   FilterExec: CounterID@1 = 62 AND CAST(CAST(EventDate@0 AS Int32) AS Date32) >= 2013-07-01 AND CAST(CAST(EventDate@0 AS Int32) AS Date32) <= 2013-07-31 AND IsRefresh@3 = 0 AND IsLink@4 != 0 AND IsDownload@5 = 0, projection=[URL@2]
|               |                     ParquetExec: file_groups={16 groups: [[Users/andrewlamb/Downloads/hits/hits_0.parquet:0..122446530, Users/andrewlamb/Downloads/hits/hits_1.parquet:0..174965044, Users/andrewlamb/Downloads/hits/hits_10.parquet:0..101513258, Users/andrewlamb/Downloads/hits/hits_11.parquet:0..118419888, Users/andrewlamb/Downloads/hits/hits_12.parquet:0..149514164, ...], [Users/andrewlamb/Downloads/hits/hits_14.parquet:108113265..151121699, Users/andrewlamb/Downloads/hits/hits_15.parquet:0..103098894, Users/andrewlamb/Downloads/hits/hits_16.parquet:0..101067219, Users/andrewlamb/Downloads/hits/hits_17.parquet:0..116867853, Users/andrewlamb/Downloads/hits/hits_18.parquet:0..133119589, ...], [Users/andrewlamb/Downloads/hits/hits_21.parquet:3887560..113455196, Users/andrewlamb/Downloads/hits/hits_22.parquet:0..79775901, Users/andrewlamb/Downloads/hits/hits_23.parquet:0..79631107, Users/andrewlamb/Downloads/hits/hits_24.parquet:0..78257049, Users/andrewlamb/Downloads/hits/hits_25.parquet:0..144169728, ...], [Users/andrewlamb/Downloads/hits/hits_28.parquet:106905624..162772407, Users/andrewlamb/Downloads/hits/hits_29.parquet:0..79213288, Users/andrewlamb/Downloads/hits/hits_3.parquet:0..192507052, Users/andrewlamb/Downloads/hits/hits_30.parquet:0..124187913, Users/andrewlamb/Downloads/hits/hits_31.parquet:0..123065410, ...], [Users/andrewlamb/Downloads/hits/hits_35.parquet:54087340..153632381, Users/andrewlamb/Downloads/hits/hits_36.parquet:0..92487304, Users/andrewlamb/Downloads/hits/hits_37.parquet:0..108247781, Users/andrewlamb/Downloads/hits/hits_38.parquet:0..132005180, Users/andrewlamb/Downloads/hits/hits_39.parquet:0..103522954, ...], ...]}, projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], predicate=CounterID@6 = 62 AND CAST(CAST(EventDate@5 AS Int32) AS Date32) >= 2013-07-01 AND CAST(CAST(EventDate@5 AS Int32) AS Date32) <= 2013-07-31 AND IsRefresh@15 = 0 AND IsLink@52 != 0 AND IsDownload@53 = 0, pruning_predicate=CASE WHEN CounterID_null_count@2 = CounterID_row_count@3 THEN false ELSE CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 END AND CASE WHEN IsRefresh_null_count@6 = IsRefresh_row_count@7 THEN false ELSE IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 END AND CASE WHEN IsLink_null_count@10 = IsLink_row_count@11 THEN false ELSE IsLink_min@8 != 0 OR 0 != IsLink_max@9 END AND CASE WHEN IsDownload_null_count@14 = IsDownload_row_count@15 THEN false ELSE IsDownload_min@12 <= 0 AND 0 <= IsDownload_max@13 END, required_guarantees=[CounterID in (62), IsDownload in (0), IsLink not in (0), IsRefresh in (0)]
|               |
+---------------+--------------------------------------------------------------
2 row(s) fetched.
Elapsed 0.090 seconds.
```
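
For anyone who wants to repeat this check across several queries without reading each plan by eye, here is a minimal sketch. It assumes the `EXPLAIN` output has already been captured as text; `operators_in_plan` and `plan_uses` are hypothetical helper names for illustration, not part of DataFusion or `datafusion-cli`.

```python
import re


def operators_in_plan(explain_text: str) -> list[str]:
    """Return physical operator names (tokens ending in 'Exec' and
    followed by ':') in the order they appear in the plan text."""
    return re.findall(r"\b(\w+Exec):", explain_text)


def plan_uses(explain_text: str, operator: str) -> bool:
    """True if the named operator appears anywhere in the plan text."""
    return operator in operators_in_plan(explain_text)


# A few lines taken from the physical plan shown above.
sample = """\
GlobalLimitExec: skip=1000, fetch=10
  SortPreservingMergeExec: [pageviews@1 DESC], fetch=1010
    SortExec: TopK(fetch=1010), expr=[pageviews@1 DESC], preserve_partitioning=[true]
"""

if __name__ == "__main__":
    print(operators_in_plan(sample))
    print(plan_uses(sample, "SortPreservingMergeExec"))
```

This only matches on operator names in the text, so it says whether `SortPreservingMergeExec` is present in the plan, not how much time it accounts for.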
