This is an automated email from the ASF dual-hosted git repository.

morrySnow pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 23e21f44f00 [improve](nereids) filter nereidsPrunedTabletIds per 
partition in distributionPrune (#63851)
23e21f44f00 is described below

commit 23e21f44f0080e25deff5dcdda2e61af3efc9480
Author: Sim Chou <[email protected]>
AuthorDate: Tue Jun 2 10:38:52 2026 +0800

    [improve](nereids) filter nereidsPrunedTabletIds per partition in 
distributionPrune (#63851)
    
    ### What problem does this PR solve?
    #53403 short-circuited distributionPrune to return the entire
    nereidsPrunedTabletIds set when running under Nereids. However, the
    caller computeTabletInfo invokes distributionPrune inside a
    per-partition loop and then iterates the returned ids, calling
    MaterializedIndex.getTablet(id) on each. When nereidsPrunedTabletIds
    contains tablets across many
    partitions, every per-partition iteration walks the entire global set
    and does a getTablet hash lookup on ids that belong to other partitions
    (which are then filtered out by the null check), yielding O(partitionNum
    * globalPrunedSize) lookups. The short-circuit also copies the full
    HashSet into a new ArrayList once per partition.
    
    Filter the global set down to the current partition's tablet ids
    (tabletIdsInOrder, already prepared by the caller) before returning. The
    result is identical to what the caller's null-check would have produced,
    so behavior is unchanged; only the redundant lookups and copies are
    eliminated. The non-Nereids path, the sampleTabletIds path and the
    empty-set
    fallback are untouched.
    
    Issue Number: close #63854
    
    Related PR: #53403
    
    Problem Summary:
    
    Plan time of OlapScan queries with many partitions and many globally
    pruned tablets degrades quadratically due to redundant per-partition
    iterations over the global pruned tablet set in
    OlapScanNode.distributionPrune. Restore per-partition complexity by
    filtering the global set down to the current partition's tablets before
    returning.
    
    Co-authored-by: zhousimin <[email protected]>
---
 .../org/apache/doris/planner/OlapScanNode.java     | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java 
b/fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java
index 1a44788e33c..74af408de59 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java
@@ -376,9 +376,20 @@ public class OlapScanNode extends ScanNode {
             DistributionInfo distributionInfo,
             boolean pruneTablesByNereids) throws AnalysisException {
         if (pruneTablesByNereids) {
-            return nereidsPrunedTabletIds.isEmpty()
-                    ? null
-                    : new ArrayList<>(nereidsPrunedTabletIds);
+            if (nereidsPrunedTabletIds.isEmpty()) {
+                return null;
+            }
+            // Filter to tablets belonging to this partition. Without this, 
the caller's
+            // per-partition loop in computeTabletInfo becomes O(partitionNum 
* globalPrunedSize)
+            // getTablet hash lookups (most returning null), which dominates 
plan time
+            // when both partition count and pruned tablet count are large.
+            List<Long> result = new ArrayList<>();
+            for (Long id : tabletIdsInOrder) {
+                if (nereidsPrunedTabletIds.contains(id)) {
+                    result.add(id);
+                }
+            }
+            return result;
         }
         DistributionPruner distributionPruner = null;
         switch (distributionInfo.getType()) {
@@ -924,8 +935,9 @@ public class OlapScanNode extends ScanNode {
             boolean notExistsSampleAndPrunedTablets = 
sampleTabletIds.isEmpty() && nereidsPrunedTabletIds.isEmpty();
             if (prunedTabletIds != null) {
                 for (Long id : prunedTabletIds) {
-                    if (selectedTable.getTablet(id) != null) {
-                        tablets.add(selectedTable.getTablet(id));
+                    Tablet tablet = selectedTable.getTablet(id);
+                    if (tablet != null) {
+                        tablets.add(tablet);
                         scanTabletIds.add(id);
                     } else if (notExistsSampleAndPrunedTablets) {
                         // The tabletID specified in query does not exist in 
this partition, skip scan partition.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to