This is an automated email from the ASF dual-hosted git repository.
yiguolei pushed a commit to branch branch-4.1
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/branch-4.1 by this push:
new 93db088f903 branch-4.1:[improvement](fe) Preload external table
metadata before internal table lock in mixed queries (#64579)
93db088f903 is described below
commit 93db088f903acd412698230357b285d2e3c79da4
Author: Wen Zhenghu <[email protected]>
AuthorDate: Thu Jun 18 16:59:35 2026 +0800
branch-4.1:[improvement](fe) Preload external table metadata before
internal table lock in mixed queries (#64579)
### What problem does this PR solve?
Issue Number: None
Related PR: #64035
Problem Summary:
This PR is the branch-4.1 migration of #64035. It aims to reduce the
time that Doris FE holds internal table plan-time read locks during
Nereids planning.
Problem
In mixed queries that involve both internal tables and external tables,
FE may access external table metadata while internal table read locks
are already held. For some external catalogs, metadata loading is lazy
and may be slow, such as schema initialization or latest snapshot
loading. As a result, the internal table lock can be held much longer
than necessary, which increases the chance of blocking concurrent
operations that need the table write lock.
Solution
This PR introduces an external metadata preload phase before internal
table locks are acquired in Nereids planning.
The main idea is:
1. Collect external tables during relation collection.
2. Preload the metadata that will likely be needed later, before locking
internal tables.
3. Acquire internal table locks only after the preload phase finishes.
4. Continue the normal analysis flow with the preloaded metadata already
cached.
The preload currently covers branch-4.1 external table implementations:
- Hive/Hudi/HMS-Iceberg external tables through `HMSExternalTable`
- Iceberg external tables through `IcebergExternalTable`
- Paimon external tables through `PaimonExternalTable`
- JDBC external tables through `JdbcExternalTable`
For snapshot-based engines, this PR only preloads the latest snapshot
when the query is using the latest view of the table. It does not
preload explicit historical snapshots, branches, tags, or other
non-latest relation forms.
For JDBC catalogs in branch-4.1, the implementation is different from
master. Branch-4.1 still uses `JdbcExternalTable` instead of the newer
`PluginDrivenExternalTable` path, so the JDBC preload capability and
schema-load debug point are implemented on `JdbcExternalTable`. This
preloads JDBC schema metadata before the lock phase, so lazy schema
initialization no longer extends the internal table lock holding window.
### Implementation summary
This PR reduces the internal table plan-time read lock window by moving
eligible external metadata loading ahead of `statementContext.lock()`.
The implementation now follows this flow:
`collect relations -> register external preload candidates -> preload
external metadata -> lock internal tables -> analyze`
The key point is that the external metadata work is no longer triggered
lazily after internal table locks are acquired. Instead, for eligible
external tables, it is executed before the lock stage.
### 1. Preload capability is implemented as a table trait
Instead of hard-coding table type checks in planner logic, preload
capability is declared on `TableIf`:
- `supportsExternalMetadataPreload()`
- `supportsLatestSnapshotPreload()`
This keeps the capability decision close to the table implementation
itself.
Current branch-4.1 coverage is:
- `HMSExternalTable`
- supports preload for Hive/Hudi/HMS-Iceberg
- supports latest-snapshot preload for Hudi and HMS-Iceberg
- `IcebergExternalTable`
- supports preload
- supports latest-snapshot preload
- `PaimonExternalTable`
- supports preload
- supports latest-snapshot preload
- `JdbcExternalTable`
- supports schema preload for JDBC external tables in branch-4.1
### 2. StatementContext only records preload candidates
`StatementContext` records relation-level preload metadata through
`registerExternalTableForPreload(...)` during relation collection.
This metadata is represented by `ExternalTablePreloadInfo`, which tracks
whether the same external table is referenced as:
- a latest relation
- a non-latest relation, for example snapshot / branch / tag /
time-travel style access
This distinction is important because snapshot-aware external tables
should not warm latest schema/partitions when they are referenced only
through non-latest relations.
### 3. Preload execution is implemented as a Nereids analysis rule
The actual preload logic is implemented in a dedicated rule:
`PreloadExternalMetadata`.
This rule runs after relation collection and before internal table locks
are acquired.
It executes at most once per statement context and produces an
`ExternalMetadataPreloadResult`, which records:
- whether preload actually ran
- candidate table count
- preloaded table count
- skip reason
- elapsed time
### 4. Preload is gated by explicit conditions
The preload rule skips execution when any of the following is true:
- `enable_preload_external_metadata` is disabled
- no eligible external preload candidates were collected
- the statement does not involve any internal table that requires a
plan-time read lock
This means the optimization only runs when it can actually help reduce
internal lock holding time.
### 5. Preload behavior is table-type aware
For each external table candidate, preload may do one or more of the
following:
- preload latest snapshot metadata
- preload schema
- preload selected partition metadata
For snapshot-aware tables, latest snapshot/schema/partition warmup is
gated by whether the table is referenced by latest-only relations.
In particular, if a table is referenced only by non-latest relations,
this PR avoids warming the latest schema/partitions. That prevents
useless cache warmup for time-travel / branch / tag queries.
For branch-4.1 JDBC catalogs, the preload path is schema-only through
`JdbcExternalTable#getBaseSchema()`.
### 6. Planner/profile integration was adjusted to avoid double counting
`NereidsPlanner` now reads the preload result produced by the
collect-phase rule and records it into a dedicated profile counter:
- `Nereids Preload External Metadata Time`
At the same time, `Nereids Lock Table Time` was narrowed to cover only
the actual `statementContext.lock()` call.
This avoids double counting preload time into both:
- `Nereids Preload External Metadata Time`
- `Nereids Lock Table Time`
After this change:
- when preload is disabled, external schema initialization can still
show up in `Nereids Analysis Time`
- when preload is enabled, that cost is shifted into `Nereids Preload
External Metadata Time`
### 7. Session variable
This PR introduces the session variable:
- `enable_preload_external_metadata`
It is currently default-off and acts as the main switch for this
optimization.
### Why this helps
Before this change, slow external metadata operations could extend the
duration for which internal table plan-time read locks were held.
After this change, the eligible external metadata work is moved before
lock acquisition, so the internal lock window is shorter and less
sensitive to slow external metadata paths.
### Release note
Improve FE planning by moving external metadata preload ahead of
internal table plan-time read locks.
### Check List (For Author)
- Test: FE Unit Test
- `./run-fe-ut.sh --run
org.apache.doris.nereids.StatementContextTest,org.apache.doris.nereids.NereidsPlannerTest,org.apache.doris.common.profile.SummaryProfileTest,org.apache.doris.qe.SessionVariablesTest`
- Test: Regression Test
- Added
`regression-test/suites/external_table_p0/jdbc/test_preload_external_metadata_profile.groovy`
- Not run locally because it depends on the external Doris docker
regression environment.
- Behavior changed: Yes
- Does this need documentation: No
---
.../java/org/apache/doris/catalog/TableIf.java | 14 +
.../doris/common/profile/SummaryProfile.java | 29 +-
.../apache/doris/datasource/ExternalCatalog.java | 15 +
.../apache/doris/datasource/ExternalDatabase.java | 15 +
.../doris/datasource/hive/HMSExternalTable.java | 12 +
.../datasource/iceberg/IcebergExternalTable.java | 10 +
.../doris/datasource/jdbc/JdbcExternalTable.java | 19 +
.../datasource/paimon/PaimonExternalTable.java | 10 +
.../org/apache/doris/nereids/CascadesContext.java | 6 +-
.../nereids/ExternalMetadataPreloadResult.java | 65 +++
.../doris/nereids/ExternalTablePreloadInfo.java | 55 +++
.../org/apache/doris/nereids/NereidsPlanner.java | 30 +-
.../org/apache/doris/nereids/StatementContext.java | 57 +++
.../executor/TableCollectAndHookInitializer.java | 26 +-
.../org/apache/doris/nereids/rules/RuleType.java | 1 +
.../nereids/rules/analysis/CollectRelation.java | 5 +
.../rules/analysis/PreloadExternalMetadata.java | 134 +++++
.../java/org/apache/doris/qe/SessionVariable.java | 17 +
.../doris/common/profile/SummaryProfileTest.java | 19 +-
.../apache/doris/nereids/NereidsPlannerTest.java | 80 +++
.../apache/doris/nereids/StatementContextTest.java | 546 +++++++++++++++++++++
.../org/apache/doris/qe/SessionVariablesTest.java | 16 +
.../test_preload_external_metadata_profile.groovy | 195 ++++++++
23 files changed, 1366 insertions(+), 10 deletions(-)
diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/TableIf.java
b/fe/fe-core/src/main/java/org/apache/doris/catalog/TableIf.java
index da7c8519c9b..0a94ed3ec24 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/catalog/TableIf.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/TableIf.java
@@ -441,6 +441,20 @@ public interface TableIf {
return false;
}
+ /**
+ * Returns whether the table can preload planning metadata before internal
table locks are acquired.
+ */
+ default boolean supportsExternalMetadataPreload() {
+ return false;
+ }
+
+ /**
+ * Returns whether the table has a meaningful latest snapshot that can be
preloaded ahead of analysis.
+ */
+ default boolean supportsLatestSnapshotPreload() {
+ return false;
+ }
+
/**
* Doris table type.
*/
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/common/profile/SummaryProfile.java
b/fe/fe-core/src/main/java/org/apache/doris/common/profile/SummaryProfile.java
index df8af0ad340..bbc7e499409 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/common/profile/SummaryProfile.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/common/profile/SummaryProfile.java
@@ -103,6 +103,7 @@ public class SummaryProfile {
public static final String GET_TABLE_VERSION_COUNT = "Get Table Version
Count";
public static final String PARSE_SQL_TIME = "Parse SQL Time";
+ public static final String NEREIDS_PRELOAD_EXTERNAL_METADATA_TIME =
"Nereids Preload External Metadata Time";
public static final String NEREIDS_LOCK_TABLE_TIME = "Nereids Lock Table
Time";
public static final String NEREIDS_ANALYSIS_TIME = "Nereids Analysis Time";
public static final String NEREIDS_REWRITE_TIME = "Nereids Rewrite Time";
@@ -160,6 +161,7 @@ public class SummaryProfile {
PARSE_SQL_TIME,
PLAN_TIME,
NEREIDS_GARBAGE_COLLECT_TIME,
+ NEREIDS_PRELOAD_EXTERNAL_METADATA_TIME,
NEREIDS_LOCK_TABLE_TIME,
NEREIDS_ANALYSIS_TIME,
NEREIDS_REWRITE_TIME,
@@ -213,6 +215,7 @@ public class SummaryProfile {
public static ImmutableMap<String, Integer>
EXECUTION_SUMMARY_KEYS_INDENTATION
= ImmutableMap.<String, Integer>builder()
.put(NEREIDS_GARBAGE_COLLECT_TIME, 1)
+ .put(NEREIDS_PRELOAD_EXTERNAL_METADATA_TIME, 1)
.put(NEREIDS_LOCK_TABLE_TIME, 1)
.put(NEREIDS_ANALYSIS_TIME, 1)
.put(NEREIDS_REWRITE_TIME, 1)
@@ -260,6 +263,10 @@ public class SummaryProfile {
private long parseSqlStartTime = -1;
@SerializedName(value = "parseSqlFinishTime")
private long parseSqlFinishTime = -1;
+ @SerializedName(value = "nereidsPreloadExternalMetadataTime")
+ private long nereidsPreloadExternalMetadataTime = 0;
+ @SerializedName(value = "nereidsLockTableStartTime")
+ private long nereidsLockTableStartTime = -1;
@SerializedName(value = "nereidsLockTableFinishTime")
private long nereidsLockTableFinishTime = -1;
@@ -478,6 +485,8 @@ public class SummaryProfile {
executionSummaryProfile.addInfoString(PARSE_SQL_TIME,
getPrettyParseSqlTime());
executionSummaryProfile.addInfoString(PLAN_TIME,
getPrettyTime(queryPlanFinishTime, parseSqlFinishTime,
TUnit.TIME_MS));
+
executionSummaryProfile.addInfoString(NEREIDS_PRELOAD_EXTERNAL_METADATA_TIME,
+ getPrettyNereidsPreloadExternalMetadataTime());
executionSummaryProfile.addInfoString(NEREIDS_LOCK_TABLE_TIME,
getPrettyNereidsLockTableTime());
executionSummaryProfile.addInfoString(NEREIDS_ANALYSIS_TIME,
getPrettyNereidsAnalysisTime());
executionSummaryProfile.addInfoString(NEREIDS_REWRITE_TIME,
getPrettyNereidsRewriteTime());
@@ -571,6 +580,10 @@ public class SummaryProfile {
this.parseSqlFinishTime = parseSqlFinishTime;
}
+ public void setNereidsLockTableStartTime(long lockTableStartTime) {
+ this.nereidsLockTableStartTime = lockTableStartTime;
+ }
+
public void setNereidsLockTableFinishTime(long lockTableFinishTime) {
this.nereidsLockTableFinishTime = lockTableFinishTime;
}
@@ -846,7 +859,11 @@ public class SummaryProfile {
}
public int getNereidsLockTableTimeMs() {
- return getTimeMs(nereidsLockTableFinishTime, parseSqlFinishTime);
+ return getTimeMs(nereidsLockTableFinishTime,
nereidsLockTableStartTime);
+ }
+
+ public long getNereidsPreloadExternalMetadataTimeMs() {
+ return nereidsPreloadExternalMetadataTime;
}
public int getNereidsAnalysisTimeMs() {
@@ -934,8 +951,12 @@ public class SummaryProfile {
return getPrettyTime(parseSqlFinishTime, parseSqlStartTime,
TUnit.TIME_MS);
}
+ public String getPrettyNereidsPreloadExternalMetadataTime() {
+ return RuntimeProfile.printCounter(nereidsPreloadExternalMetadataTime,
TUnit.TIME_MS);
+ }
+
public String getPrettyNereidsLockTableTime() {
- return getPrettyTime(nereidsLockTableFinishTime, parseSqlFinishTime,
TUnit.TIME_MS);
+ return getPrettyTime(nereidsLockTableFinishTime,
nereidsLockTableStartTime, TUnit.TIME_MS);
}
public String getPrettyNereidsAnalysisTime() {
@@ -1148,6 +1169,10 @@ public class SummaryProfile {
this.nereidsMvRewriteTime += ms;
}
+ public void addNereidsPreloadExternalMetadataTime(long ms) {
+ this.nereidsPreloadExternalMetadataTime += ms;
+ }
+
public long getNereidsMvRewriteTimeMs() {
return nereidsMvRewriteTime;
}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
index 43d22870ef6..8ae2d8dd243 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java
@@ -34,6 +34,7 @@ import org.apache.doris.common.ThreadPoolManager;
import org.apache.doris.common.UserException;
import org.apache.doris.common.Version;
import org.apache.doris.common.security.authentication.ExecutionAuthenticator;
+import org.apache.doris.common.util.DebugPointUtil;
import org.apache.doris.common.util.Util;
import
org.apache.doris.datasource.connectivity.CatalogConnectivityTestCoordinator;
import org.apache.doris.datasource.doris.RemoteDorisExternalDatabase;
@@ -241,6 +242,20 @@ public abstract class ExternalCatalog
if (metadataOps == null) {
throw new UnsupportedOperationException("List databases is not
supported for catalog: " + getName());
} else {
+ // Allow manual regression to isolate catalog-level metadata
enumeration cost during collect.
+ if
(DebugPointUtil.isEnable("ExternalCatalog.listDatabaseNames.sleep")) {
+ long sleepMs = DebugPointUtil.getDebugParamOrDefault(
+ "ExternalCatalog.listDatabaseNames.sleep", "sleepMs",
0L);
+ if (sleepMs > 0) {
+ LOG.info("debug point
ExternalCatalog.listDatabaseNames.sleep hit for {}, sleep {}ms",
+ getName(), sleepMs);
+ try {
+ Thread.sleep(sleepMs);
+ } catch (InterruptedException ignore) {
+ Thread.currentThread().interrupt();
+ }
+ }
+ }
return metadataOps.listDatabaseNames();
}
}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalDatabase.java
b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalDatabase.java
index 1d4c1746503..940fade40a8 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalDatabase.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalDatabase.java
@@ -29,6 +29,7 @@ import org.apache.doris.common.FeConstants;
import org.apache.doris.common.MetaNotFoundException;
import org.apache.doris.common.Pair;
import org.apache.doris.common.lock.MonitoredReentrantReadWriteLock;
+import org.apache.doris.common.util.DebugPointUtil;
import org.apache.doris.common.util.Util;
import org.apache.doris.datasource.infoschema.ExternalInfoSchemaDatabase;
import org.apache.doris.datasource.infoschema.ExternalMysqlDatabase;
@@ -195,6 +196,20 @@ public abstract class ExternalDatabase<T extends
ExternalTable>
})
.collect(Collectors.toList());
} else {
+ // Allow manual regression to isolate database-level table
enumeration cost during collect.
+ if
(DebugPointUtil.isEnable("ExternalDatabase.listTableNames.sleep")) {
+ long sleepMs = DebugPointUtil.getDebugParamOrDefault(
+ "ExternalDatabase.listTableNames.sleep", "sleepMs",
0L);
+ if (sleepMs > 0) {
+ LOG.info("debug point
ExternalDatabase.listTableNames.sleep hit for {}.{}, sleep {}ms",
+ extCatalog.getName(), remoteName, sleepMs);
+ try {
+ Thread.sleep(sleepMs);
+ } catch (InterruptedException ignore) {
+ Thread.currentThread().interrupt();
+ }
+ }
+ }
tableNames = extCatalog.listTableNames(null,
remoteName).stream().map(tableName -> {
String localTableName =
extCatalog.fromRemoteTableName(remoteName, tableName);
if (this.isStoredTableNamesLowerCase()) {
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSExternalTable.java
b/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSExternalTable.java
index b7add486018..e67e8f43201 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSExternalTable.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSExternalTable.java
@@ -419,6 +419,18 @@ public class HMSExternalTable extends ExternalTable
implements MTMVRelatedTableI
return getDlaType() == DLAType.HIVE || getDlaType() == DLAType.HUDI;
}
+ @Override
+ public boolean supportsExternalMetadataPreload() {
+ return true;
+ }
+
+ @Override
+ public boolean supportsLatestSnapshotPreload() {
+ // HMSExternalTable may represent Hive, Hudi, or Iceberg tables.
+ // Only snapshot-aware table types should preload latest snapshot
metadata.
+ return getDlaType() == DLAType.HUDI || getDlaType() == DLAType.ICEBERG;
+ }
+
@Override
public Optional<SortedPartitionRanges<String>>
getSortedPartitionRanges(CatalogRelation scan) {
if (getDlaType() != DLAType.HIVE) {
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergExternalTable.java
b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergExternalTable.java
index cdfd574b8e6..7dfdf6aed92 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergExternalTable.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergExternalTable.java
@@ -311,6 +311,16 @@ public class IcebergExternalTable extends ExternalTable
implements MTMVRelatedTa
return true;
}
+ @Override
+ public boolean supportsExternalMetadataPreload() {
+ return true;
+ }
+
+ @Override
+ public boolean supportsLatestSnapshotPreload() {
+ return true;
+ }
+
@VisibleForTesting
public boolean isValidRelatedTableCached() {
return isValidRelatedTableCached;
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/datasource/jdbc/JdbcExternalTable.java
b/fe/fe-core/src/main/java/org/apache/doris/datasource/jdbc/JdbcExternalTable.java
index 9c33ec6414d..0ba221f5d87 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/datasource/jdbc/JdbcExternalTable.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/datasource/jdbc/JdbcExternalTable.java
@@ -21,6 +21,7 @@ import org.apache.doris.analysis.StatementBase;
import org.apache.doris.catalog.Column;
import org.apache.doris.catalog.JdbcResource;
import org.apache.doris.catalog.JdbcTable;
+import org.apache.doris.common.util.DebugPointUtil;
import org.apache.doris.datasource.ExternalDatabase;
import org.apache.doris.datasource.ExternalTable;
import org.apache.doris.datasource.SchemaCacheValue;
@@ -125,9 +126,27 @@ public class JdbcExternalTable extends ExternalTable {
return jdbcTable.toThrift();
}
+ @Override
+ public boolean supportsExternalMetadataPreload() {
+ return true;
+ }
+
@Override
public Optional<SchemaCacheValue> initSchema() {
String remoteDbName = ((ExternalDatabase<?>)
this.getDatabase()).getRemoteName();
+ if (DebugPointUtil.isEnable("JdbcExternalTable.initSchema.sleep")) {
+ long sleepMs = DebugPointUtil.getDebugParamOrDefault(
+ "JdbcExternalTable.initSchema.sleep", "sleepMs", 0L);
+ if (sleepMs > 0) {
+ LOG.info("debug point JdbcExternalTable.initSchema.sleep hit
for {}.{}, sleep {}ms",
+ remoteDbName, getRemoteName(), sleepMs);
+ try {
+ Thread.sleep(sleepMs);
+ } catch (InterruptedException ignore) {
+ Thread.currentThread().interrupt();
+ }
+ }
+ }
// 1. Retrieve remote column information
List<Column> columns = ((JdbcExternalCatalog)
catalog).listColumns(remoteDbName, remoteName);
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java
b/fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java
index 78639761ede..1775a984f2c 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java
@@ -319,6 +319,16 @@ public class PaimonExternalTable extends ExternalTable
implements MTMVRelatedTab
return true;
}
+ @Override
+ public boolean supportsExternalMetadataPreload() {
+ return true;
+ }
+
+ @Override
+ public boolean supportsLatestSnapshotPreload() {
+ return true;
+ }
+
@Override
public List<Column> getFullSchema() {
return
getPaimonSchemaCacheValue(MvccUtil.getSnapshotFromContext(this)).getSchema();
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/CascadesContext.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/CascadesContext.java
index af108aafea3..257315a496e 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/CascadesContext.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/CascadesContext.java
@@ -289,7 +289,11 @@ public class CascadesContext implements ScheduleContext {
}
public TableCollectAndHookInitializer newTableCollector(boolean
firstLevel) {
- return new TableCollectAndHookInitializer(this, firstLevel);
+ return newTableCollector(firstLevel, false);
+ }
+
+ public TableCollectAndHookInitializer newTableCollector(boolean
firstLevel, boolean enablePreloadRule) {
+ return new TableCollectAndHookInitializer(this, firstLevel,
enablePreloadRule);
}
public Analyzer newAnalyzer() {
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/ExternalMetadataPreloadResult.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/ExternalMetadataPreloadResult.java
new file mode 100644
index 00000000000..c6cea9916d4
--- /dev/null
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/ExternalMetadataPreloadResult.java
@@ -0,0 +1,65 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids;
+
+/** Summarizes whether external metadata preload ran and what it processed. */
+public class ExternalMetadataPreloadResult {
+ private final boolean executed;
+ private final int candidateTableCount;
+ private final int preloadedTableCount;
+ private final String skipReason;
+ private final long elapsedTimeMs;
+
+ private ExternalMetadataPreloadResult(boolean executed, int
candidateTableCount,
+ int preloadedTableCount, String skipReason, long elapsedTimeMs) {
+ this.executed = executed;
+ this.candidateTableCount = candidateTableCount;
+ this.preloadedTableCount = preloadedTableCount;
+ this.skipReason = skipReason;
+ this.elapsedTimeMs = elapsedTimeMs;
+ }
+
+ public static ExternalMetadataPreloadResult executed(
+ int candidateTableCount, int preloadedTableCount, long
elapsedTimeMs) {
+ return new ExternalMetadataPreloadResult(true, candidateTableCount,
preloadedTableCount, "", elapsedTimeMs);
+ }
+
+ public static ExternalMetadataPreloadResult skipped(int
candidateTableCount, String skipReason) {
+ return new ExternalMetadataPreloadResult(false, candidateTableCount,
0, skipReason, 0);
+ }
+
+ public boolean isExecuted() {
+ return executed;
+ }
+
+ public int getCandidateTableCount() {
+ return candidateTableCount;
+ }
+
+ public int getPreloadedTableCount() {
+ return preloadedTableCount;
+ }
+
+ public String getSkipReason() {
+ return skipReason;
+ }
+
+ public long getElapsedTimeMs() {
+ return elapsedTimeMs;
+ }
+}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/ExternalTablePreloadInfo.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/ExternalTablePreloadInfo.java
new file mode 100644
index 00000000000..715b8c232c1
--- /dev/null
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/ExternalTablePreloadInfo.java
@@ -0,0 +1,55 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids;
+
+import org.apache.doris.datasource.ExternalTable;
+
+/** Tracks how a single external table is referenced before metadata preload
happens. */
+public class ExternalTablePreloadInfo {
+ private final ExternalTable table;
+ private boolean hasLatestOnlyRelation;
+ private boolean hasNonLatestRelation;
+
+ public ExternalTablePreloadInfo(ExternalTable table) {
+ this.table = table;
+ }
+
+ public ExternalTable getTable() {
+ return table;
+ }
+
+ public void markLatestRelation() {
+ hasLatestOnlyRelation = true;
+ }
+
+ public void markNonLatestRelation() {
+ hasNonLatestRelation = true;
+ }
+
+ public boolean hasLatestOnlyRelation() {
+ return hasLatestOnlyRelation;
+ }
+
+ public boolean hasNonLatestRelation() {
+ return hasNonLatestRelation;
+ }
+
+ public boolean shouldPreloadLatestSnapshot() {
+ return hasLatestOnlyRelation && !hasNonLatestRelation;
+ }
+}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java
index 69af5dac8d9..4d3c5cf9c59 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java
@@ -394,7 +394,35 @@ public class NereidsPlanner extends Planner {
if (LOG.isDebugEnabled()) {
LOG.debug("Start collect and lock table");
}
- keepOrShowPlanProcess(showPlanProcess, () ->
cascadesContext.newTableCollector(true).collect());
+ keepOrShowPlanProcess(showPlanProcess, () ->
cascadesContext.newTableCollector(true, true).collect());
+ // Read the preload result produced by the collect-phase rule before
taking internal table locks.
+ ExternalMetadataPreloadResult preloadResult =
statementContext.getExternalMetadataPreloadResult()
+ .orElse(ExternalMetadataPreloadResult.skipped(
+
statementContext.getExternalTablePreloadCandidateCount(), "preload rule did not
run"));
+ // Record preload timing in the query profile as a dedicated planner
sub-stage.
+ if (statementContext.getConnectContext().getExecutor() != null &&
preloadResult.isExecuted()) {
+
statementContext.getConnectContext().getExecutor().getSummaryProfile()
+
.addNereidsPreloadExternalMetadataTime(preloadResult.getElapsedTimeMs());
+ }
+ // Keep a concise debug summary for the entire preload phase.
+ if (LOG.isDebugEnabled()) {
+ if (preloadResult.isExecuted()) {
+ LOG.debug("{} preloaded external metadata for {} of {}
candidate tables in {} ms",
+
statementContext.getConnectContext().getQueryIdentifier(),
+ preloadResult.getPreloadedTableCount(),
+ preloadResult.getCandidateTableCount(),
+ preloadResult.getElapsedTimeMs());
+ } else {
+ LOG.debug("{} skip external metadata preload before lock: {}
[candidateTableCount={}]",
+
statementContext.getConnectContext().getQueryIdentifier(),
preloadResult.getSkipReason(),
+ preloadResult.getCandidateTableCount());
+ }
+ }
+ if (statementContext.getConnectContext().getExecutor() != null) {
+ // Track only the actual lock() call here so the dedicated preload
stage is not double counted.
+
statementContext.getConnectContext().getExecutor().getSummaryProfile()
+ .setNereidsLockTableStartTime(TimeUtils.getStartTimeMs());
+ }
statementContext.lock();
cascadesContext.setCteContext(new CTEContext());
NereidsTracer.logImportantTime("EndCollectAndLockTables");
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/StatementContext.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/StatementContext.java
index d57f975384e..8e70ba5e41c 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/StatementContext.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/StatementContext.java
@@ -30,6 +30,7 @@ import org.apache.doris.catalog.View;
import org.apache.doris.common.Id;
import org.apache.doris.common.IdGenerator;
import org.apache.doris.common.Pair;
+import org.apache.doris.datasource.ExternalTable;
import org.apache.doris.datasource.mvcc.MvccSnapshot;
import org.apache.doris.datasource.mvcc.MvccTable;
import org.apache.doris.datasource.mvcc.MvccTableInfo;
@@ -84,6 +85,7 @@ import java.io.Closeable;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collection;
+import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
@@ -261,6 +263,9 @@ public class StatementContext implements Closeable {
private Backend groupCommitMergeBackend;
private final Map<MvccTableInfo, MvccSnapshot> snapshots =
Maps.newHashMap();
+ // Record external tables that can be preloaded before internal table
locks are acquired.
+ private final Map<Long, ExternalTablePreloadInfo>
externalTablePreloadInfos = new LinkedHashMap<>();
+ private ExternalMetadataPreloadResult externalMetadataPreloadResult;
private boolean privChecked;
@@ -462,6 +467,27 @@ public class StatementContext implements Closeable {
return connectContext;
}
+ /**
+ * Register an external relation that may preload metadata before internal
table locks are acquired.
+ *
+ * @param table external table referenced by the relation
+ * @param tableSnapshot optional explicit snapshot specification on the
relation
+ * @param scanParams optional relation scan parameters such as branch or
tag
+ */
+ public void registerExternalTableForPreload(TableIf table,
Optional<TableSnapshot> tableSnapshot,
+ Optional<TableScanParams> scanParams) {
+ if (!(table instanceof ExternalTable) ||
!table.supportsExternalMetadataPreload()) {
+ return;
+ }
+ ExternalTablePreloadInfo preloadInfo =
externalTablePreloadInfos.computeIfAbsent(table.getId(),
+ id -> new ExternalTablePreloadInfo((ExternalTable) table));
+ if (tableSnapshot.isPresent() || scanParams.isPresent()) {
+ preloadInfo.markNonLatestRelation();
+ } else {
+ preloadInfo.markLatestRelation();
+ }
+ }
+
public void setOriginStatement(OriginStatement originStatement) {
this.originStatement = originStatement;
if (originStatement != null && sqlCacheContext != null) {
@@ -939,6 +965,37 @@ public class StatementContext implements Closeable {
snapshots.put(mvccTableInfo, snapshot);
}
+ public Collection<ExternalTablePreloadInfo> getExternalTablePreloadInfos()
{
+ return
Collections.unmodifiableCollection(externalTablePreloadInfos.values());
+ }
+
+ public int getExternalTablePreloadCandidateCount() {
+ return externalTablePreloadInfos.size();
+ }
+
+ public boolean hasAnyPlanReadLockTable() {
+ return containsPlanReadLockTable(tables.values())
+ || containsPlanReadLockTable(mtmvRelatedTables.values())
+ || containsPlanReadLockTable(insertTargetTables.values());
+ }
+
+ public Optional<ExternalMetadataPreloadResult>
getExternalMetadataPreloadResult() {
+ return Optional.ofNullable(externalMetadataPreloadResult);
+ }
+
+ public void setExternalMetadataPreloadResult(ExternalMetadataPreloadResult
result) {
+ this.externalMetadataPreloadResult = result;
+ }
+
+ private boolean containsPlanReadLockTable(Collection<TableIf> tableIfs) {
+ for (TableIf tableIf : tableIfs) {
+ if (tableIf.needReadLockWhenPlan()) {
+ return true;
+ }
+ }
+ return false;
+ }
+
private static class CloseableResource implements Closeable {
public final String resourceName;
public final String threadName;
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/executor/TableCollectAndHookInitializer.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/executor/TableCollectAndHookInitializer.java
index 02b2dc76fb5..c796a742097 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/executor/TableCollectAndHookInitializer.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/executor/TableCollectAndHookInitializer.java
@@ -21,6 +21,7 @@ import org.apache.doris.nereids.CascadesContext;
import org.apache.doris.nereids.jobs.rewrite.RewriteJob;
import org.apache.doris.nereids.rules.analysis.AddInitMaterializationHook;
import org.apache.doris.nereids.rules.analysis.CollectRelation;
+import org.apache.doris.nereids.rules.analysis.PreloadExternalMetadata;
import org.apache.doris.nereids.trees.plans.logical.LogicalView;
import com.google.common.collect.ImmutableSet;
@@ -35,14 +36,22 @@ public class TableCollectAndHookInitializer extends
AbstractBatchJobExecutor {
public final List<RewriteJob> collectJobs;
+ /**
+ * Keep the legacy collector entry point for nested collect passes that
should not trigger preload.
+ */
+ public TableCollectAndHookInitializer(CascadesContext cascadesContext,
boolean firstLevel) {
+ this(cascadesContext, firstLevel, false);
+ }
+
/**
* constructor of Analyzer. For view, we only do bind relation since other
analyze step will do by outer Analyzer.
*
* @param cascadesContext current context for analyzer
*/
- public TableCollectAndHookInitializer(CascadesContext cascadesContext,
boolean firstLevel) {
+ public TableCollectAndHookInitializer(
+ CascadesContext cascadesContext, boolean firstLevel, boolean
enablePreloadRule) {
super(cascadesContext);
- collectJobs = buildCollectTableJobs(firstLevel);
+ collectJobs = buildCollectTableJobs(firstLevel, enablePreloadRule);
}
@Override
@@ -57,14 +66,21 @@ public class TableCollectAndHookInitializer extends
AbstractBatchJobExecutor {
execute();
}
- private static List<RewriteJob> buildCollectTableJobs(boolean firstLevel) {
+ private static List<RewriteJob> buildCollectTableJobs(boolean firstLevel,
boolean enablePreloadRule) {
return notTraverseChildrenOf(
ImmutableSet.of(LogicalView.class),
- () ->
TableCollectAndHookInitializer.buildCollectorJobs(firstLevel)
+ () ->
TableCollectAndHookInitializer.buildCollectorJobs(firstLevel, enablePreloadRule)
);
}
- private static List<RewriteJob> buildCollectorJobs(boolean firstLevel) {
+ private static List<RewriteJob> buildCollectorJobs(boolean firstLevel,
boolean enablePreloadRule) {
+ if (enablePreloadRule) {
+ return jobs(
+ topDown(new AddInitMaterializationHook()),
+ topDown(new CollectRelation(firstLevel)),
+ topDown(new PreloadExternalMetadata())
+ );
+ }
return jobs(
topDown(new AddInitMaterializationHook()),
topDown(new CollectRelation(firstLevel))
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java
index 228ea72d41f..749f8e1093a 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleType.java
@@ -48,6 +48,7 @@ public enum RuleType {
INIT_MATERIALIZATION_HOOK_FOR_FILE_SINK(RuleTypeClass.REWRITE),
INIT_MATERIALIZATION_HOOK_FOR_TABLE_SINK(RuleTypeClass.REWRITE),
INIT_MATERIALIZATION_HOOK_FOR_RESULT_SINK(RuleTypeClass.REWRITE),
+ PRELOAD_EXTERNAL_METADATA(RuleTypeClass.REWRITE),
BINDING_INSERT_FILE(RuleTypeClass.REWRITE),
BINDING_ONE_ROW_RELATION_SLOT(RuleTypeClass.REWRITE),
BINDING_RELATION(RuleTypeClass.REWRITE),
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/CollectRelation.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/CollectRelation.java
index fee51734485..c5e291072d0 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/CollectRelation.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/CollectRelation.java
@@ -208,6 +208,11 @@ public class CollectRelation implements
AnalysisRuleFactory {
} else {
StatementContext statementContext =
cascadesContext.getConnectContext().getStatementContext();
table = statementContext.getAndCacheTable(tableQualifier,
tableFrom, unboundRelation);
+ // Record relation-level metadata so the planner can preload
latest external metadata before locking.
+ if (tableFrom == TableFrom.QUERY && unboundRelation.isPresent()) {
+ statementContext.registerExternalTableForPreload(table,
unboundRelation.get().getTableSnapshot(),
+
Optional.ofNullable(unboundRelation.get().getScanParams()));
+ }
if (firstLevel) {
statementContext.getOneLevelTables().put(tableQualifier,
table);
}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/PreloadExternalMetadata.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/PreloadExternalMetadata.java
new file mode 100644
index 00000000000..524a8b6e79d
--- /dev/null
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/PreloadExternalMetadata.java
@@ -0,0 +1,134 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.rules.analysis;
+
+import org.apache.doris.common.util.TimeUtils;
+import org.apache.doris.datasource.ExternalTable;
+import org.apache.doris.nereids.ExternalMetadataPreloadResult;
+import org.apache.doris.nereids.ExternalTablePreloadInfo;
+import org.apache.doris.nereids.StatementContext;
+import org.apache.doris.nereids.rules.Rule;
+import org.apache.doris.nereids.rules.RuleType;
+import org.apache.doris.qe.ConnectContext;
+
+import com.google.common.collect.ImmutableList;
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+
+import java.util.List;
+import java.util.Optional;
+
+/**
+ * Preload external metadata after relation collection and before internal
table locks are acquired.
+ */
+public class PreloadExternalMetadata implements AnalysisRuleFactory {
+ private static final Logger LOG =
LogManager.getLogger(PreloadExternalMetadata.class);
+
+ @Override
+ public List<Rule> buildRules() {
+ return ImmutableList.of(
+ any().thenApply(ctx -> {
+ StatementContext statementContext = ctx.statementContext;
+ // Run preload at most once even if the collect pipeline
re-enters the same statement context.
+ if
(!statementContext.getExternalMetadataPreloadResult().isPresent()) {
+
statementContext.setExternalMetadataPreloadResult(executePreload(statementContext));
+ }
+ return ctx.root;
+ }).toRule(RuleType.PRELOAD_EXTERNAL_METADATA)
+ );
+ }
+
+ /**
+ * Execute external metadata preload after relation collection and before
internal table locks.
+ */
+ public ExternalMetadataPreloadResult executePreload(StatementContext
statementContext) {
+ long preloadStartTime = TimeUtils.getStartTimeMs();
+ Optional<String> skipReason = getSkipReason(statementContext);
+ if (skipReason.isPresent()) {
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("{} skip external metadata preload before lock: {}",
+ getPreloadQueryIdentifier(statementContext),
skipReason.get());
+ }
+ return ExternalMetadataPreloadResult.skipped(
+ statementContext.getExternalTablePreloadCandidateCount(),
skipReason.get());
+ }
+ int preloadedTableCount = 0;
+ for (ExternalTablePreloadInfo preloadInfo :
statementContext.getExternalTablePreloadInfos()) {
+ if (preloadExternalTable(statementContext, preloadInfo)) {
+ preloadedTableCount++;
+ }
+ }
+ return ExternalMetadataPreloadResult.executed(
+ statementContext.getExternalTablePreloadCandidateCount(),
+ preloadedTableCount,
+ TimeUtils.getElapsedTimeMs(preloadStartTime));
+ }
+
+ private Optional<String> getSkipReason(StatementContext statementContext) {
+ ConnectContext connectContext = statementContext.getConnectContext();
+ if (connectContext == null || connectContext.getSessionVariable() ==
null
+ ||
!connectContext.getSessionVariable().isEnablePreloadExternalMetadata()) {
+ return Optional.of("session variable
enable_preload_external_metadata is disabled");
+ }
+ if (statementContext.getExternalTablePreloadCandidateCount() == 0) {
+ return Optional.of("no external preload candidates were
collected");
+ }
+ if (!statementContext.hasAnyPlanReadLockTable()) {
+ return Optional.of("no internal tables require plan-time read
lock");
+ }
+ return Optional.empty();
+ }
+
+ private boolean preloadExternalTable(StatementContext statementContext,
ExternalTablePreloadInfo preloadInfo) {
+ ExternalTable table = preloadInfo.getTable();
+ long preloadStartTime = TimeUtils.getStartTimeMs();
+ boolean supportsLatestSnapshot = table.supportsLatestSnapshotPreload();
+ boolean latestOnlyRelation = preloadInfo.shouldPreloadLatestSnapshot();
+ boolean preloadLatestSnapshot = latestOnlyRelation &&
supportsLatestSnapshot;
+ // Skip schema and partition warmup for snapshot-aware tables when
only non-latest relations are referenced.
+ boolean preloadSchema = !supportsLatestSnapshot || latestOnlyRelation;
+ boolean preloadPartition = preloadSchema &&
table.supportInternalPartitionPruned();
+ if (preloadLatestSnapshot) {
+ statementContext.loadSnapshots(table, Optional.empty(),
Optional.empty());
+ }
+ if (preloadSchema) {
+ table.getBaseSchema();
+ }
+ if (preloadPartition) {
+ table.initSelectedPartitions(statementContext.getSnapshot(table));
+ }
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("{} preloaded external metadata for table {} "
+ + "[supportsLatestSnapshot={},
preloadLatestSnapshot={}, preloadSchema={}, "
+ + "preloadPartition={}, hasLatestRelation={},
hasNonLatestRelation={}, elapsedMs={}]",
+ getPreloadQueryIdentifier(statementContext),
getExternalTableLogName(table), supportsLatestSnapshot,
+ preloadLatestSnapshot, preloadSchema, preloadPartition,
preloadInfo.hasLatestOnlyRelation(),
+ preloadInfo.hasNonLatestRelation(),
TimeUtils.getElapsedTimeMs(preloadStartTime));
+ }
+ return preloadLatestSnapshot || preloadSchema || preloadPartition;
+ }
+
+ private String getPreloadQueryIdentifier(StatementContext
statementContext) {
+ ConnectContext connectContext = statementContext.getConnectContext();
+ return connectContext == null ? "stmt[unknown]" :
connectContext.getQueryIdentifier();
+ }
+
+ private String getExternalTableLogName(ExternalTable table) {
+ return table.getCatalog().getName() + "." + table.getDbName() + "." +
table.getName();
+ }
+}
diff --git a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
index 9242527f155..81feb7242ab 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
@@ -388,6 +388,7 @@ public class SessionVariable implements Serializable,
Writable {
public static final String NTH_OPTIMIZED_PLAN = "nth_optimized_plan";
public static final String ENABLE_NEREIDS_PLANNER =
"enable_nereids_planner";
+ public static final String ENABLE_PRELOAD_EXTERNAL_METADATA =
"enable_preload_external_metadata";
public static final String ENABLE_NEREIDS_DISTRIBUTE_PLANNER =
"enable_nereids_distribute_planner";
public static final String DISABLE_NEREIDS_RULES = "disable_nereids_rules";
public static final String ENABLE_NEREIDS_RULES = "enable_nereids_rules";
@@ -1982,6 +1983,14 @@ public class SessionVariable implements Serializable,
Writable {
@VariableMgr.VarAttr(name = ENABLE_NEREIDS_PLANNER, needForward = true,
varType = VariableAnnotation.REMOVED)
private boolean enableNereidsPlanner = true;
+ @VariableMgr.VarAttr(name = ENABLE_PRELOAD_EXTERNAL_METADATA,
+ needForward = true, fuzzy = false, varType =
VariableAnnotation.EXPERIMENTAL, description = {
+ "是否在获取内表规划期读锁前预加载 Hive/Hudi/Iceberg/Paimon/JDBC 外表元数据",
+ "Whether to preload Hive/Hudi/Iceberg/Paimon/JDBC external
table metadata before internal table "
+ + "plan-time read locks are acquired"
+ })
+ private boolean enablePreloadExternalMetadata = false;
+
@VariableMgr.VarAttr(name = DISABLE_NEREIDS_RULES, needForward = true)
private String disableNereidsRules = "";
@@ -4414,6 +4423,14 @@ public class SessionVariable implements Serializable,
Writable {
return enableSqlCache;
}
+ public boolean isEnablePreloadExternalMetadata() {
+ return enablePreloadExternalMetadata;
+ }
+
+ public void setEnablePreloadExternalMetadata(boolean enablePreload) {
+ this.enablePreloadExternalMetadata = enablePreload;
+ }
+
public void setEnableSqlCache(boolean enableSqlCache) {
this.enableSqlCache = enableSqlCache;
}
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/common/profile/SummaryProfileTest.java
b/fe/fe-core/src/test/java/org/apache/doris/common/profile/SummaryProfileTest.java
index 24cd69915b5..ef6fd3d63f3 100644
---
a/fe/fe-core/src/test/java/org/apache/doris/common/profile/SummaryProfileTest.java
+++
b/fe/fe-core/src/test/java/org/apache/doris/common/profile/SummaryProfileTest.java
@@ -29,6 +29,7 @@ public class SummaryProfileTest {
profile.setQueryBeginTime(1);
profile.setParseSqlStartTime(3);
profile.setParseSqlFinishTime(6);
+ profile.setNereidsLockTableStartTime(8);
profile.setNereidsLockTableFinishTime(10);
profile.setNereidsAnalysisTime(15);
profile.setNereidsRewriteTime(21);
@@ -41,6 +42,8 @@ public class SummaryProfileTest {
profile.setQueryScheduleFinishTime(78);
profile.setQueryFetchResultFinishTime(91);
+ // Record the standalone preload stage before the planner takes
internal table locks.
+ profile.addNereidsPreloadExternalMetadataTime(2);
profile.addCollectTablePartitionTime(7);
// update summary time
profile.update(ImmutableMap.of());
@@ -48,7 +51,9 @@ public class SummaryProfileTest {
RuntimeProfile executionSummary = profile.getExecutionSummary();
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.PARSE_SQL_TIME),
"3ms");
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.PLAN_TIME),
"60ms");
-
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.NEREIDS_LOCK_TABLE_TIME),
"4ms");
+ Assertions.assertEquals(executionSummary.getInfoString(
+ SummaryProfile.NEREIDS_PRELOAD_EXTERNAL_METADATA_TIME), "2ms");
+
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.NEREIDS_LOCK_TABLE_TIME),
"2ms");
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.NEREIDS_ANALYSIS_TIME),
"5ms");
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.NEREIDS_REWRITE_TIME),
"6ms");
@@ -60,4 +65,16 @@ public class SummaryProfileTest {
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.SCHEDULE_TIME),
"12ms");
Assertions.assertEquals(executionSummary.getInfoString(SummaryProfile.WAIT_FETCH_RESULT_TIME),
"13ms");
}
+
+ @Test
+ public void testPreloadExternalMetadataTimeCounter() {
+ SummaryProfile profile = new SummaryProfile();
+
+ // Verify the dedicated preload counter is accumulated independently
from other planner stages.
+ profile.addNereidsPreloadExternalMetadataTime(12);
+ profile.addNereidsPreloadExternalMetadataTime(8);
+
+ Assertions.assertEquals(20,
profile.getNereidsPreloadExternalMetadataTimeMs());
+ Assertions.assertEquals("20ms",
profile.getPrettyNereidsPreloadExternalMetadataTime());
+ }
}
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/nereids/NereidsPlannerTest.java
b/fe/fe-core/src/test/java/org/apache/doris/nereids/NereidsPlannerTest.java
new file mode 100644
index 00000000000..af93c18093c
--- /dev/null
+++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/NereidsPlannerTest.java
@@ -0,0 +1,80 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids;
+
+import org.apache.doris.common.jmockit.Deencapsulation;
+import org.apache.doris.common.profile.SummaryProfile;
+import org.apache.doris.nereids.jobs.executor.TableCollectAndHookInitializer;
+import org.apache.doris.qe.ConnectContext;
+import org.apache.doris.qe.OriginStatement;
+import org.apache.doris.qe.StmtExecutor;
+
+import org.junit.jupiter.api.Test;
+import org.mockito.Mockito;
+
+public class NereidsPlannerTest {
+
+ @Test
+ public void testCollectAndLockTableRecordsPreloadTimeWhenExecuted() {
+ SummaryProfile summaryProfile = Mockito.mock(SummaryProfile.class);
+ NereidsPlanner planner = createPlanner(
+ ExternalMetadataPreloadResult.executed(2, 1, 123L),
summaryProfile);
+
+ planner.collectAndLockTable(false);
+
+
Mockito.verify(summaryProfile).addNereidsPreloadExternalMetadataTime(Mockito.anyLong());
+
Mockito.verify(summaryProfile).setNereidsLockTableStartTime(Mockito.anyLong());
+
Mockito.verify(summaryProfile).setNereidsLockTableFinishTime(Mockito.anyLong());
+ }
+
+ @Test
+ public void testCollectAndLockTableSkipsPreloadCounterWhenNotExecuted() {
+ SummaryProfile summaryProfile = Mockito.mock(SummaryProfile.class);
+ NereidsPlanner planner = createPlanner(
+ ExternalMetadataPreloadResult.skipped(1, "skip preload"),
summaryProfile);
+
+ planner.collectAndLockTable(false);
+
+ Mockito.verify(summaryProfile,
Mockito.never()).addNereidsPreloadExternalMetadataTime(Mockito.anyLong());
+
Mockito.verify(summaryProfile).setNereidsLockTableStartTime(Mockito.anyLong());
+
Mockito.verify(summaryProfile).setNereidsLockTableFinishTime(Mockito.anyLong());
+ }
+
+ private NereidsPlanner createPlanner(ExternalMetadataPreloadResult
preloadResult, SummaryProfile summaryProfile) {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ StmtExecutor executor = Mockito.mock(StmtExecutor.class);
+ StatementContext statementContext = Mockito.spy(
+ new StatementContext(connectContext, new
OriginStatement("select 1", 0)));
+ CascadesContext cascadesContext = Mockito.mock(CascadesContext.class);
+ TableCollectAndHookInitializer tableCollector =
Mockito.mock(TableCollectAndHookInitializer.class);
+
+ Mockito.when(connectContext.getExecutor()).thenReturn(executor);
+
Mockito.when(connectContext.getQueryIdentifier()).thenReturn("query-1");
+ Mockito.when(executor.getSummaryProfile()).thenReturn(summaryProfile);
+ Mockito.when(cascadesContext.newTableCollector(true,
true)).thenReturn(tableCollector);
+ Mockito.doAnswer(invocation -> {
+ statementContext.setExternalMetadataPreloadResult(preloadResult);
+ return null;
+ }).when(tableCollector).collect();
+ Mockito.doNothing().when(statementContext).lock();
+
+ NereidsPlanner planner = new NereidsPlanner(statementContext);
+ Deencapsulation.setField(planner, "cascadesContext", cascadesContext);
+ return planner;
+ }
+}
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/nereids/StatementContextTest.java
b/fe/fe-core/src/test/java/org/apache/doris/nereids/StatementContextTest.java
new file mode 100644
index 00000000000..6df0d4693cb
--- /dev/null
+++
b/fe/fe-core/src/test/java/org/apache/doris/nereids/StatementContextTest.java
@@ -0,0 +1,546 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids;
+
+import org.apache.doris.analysis.TableSnapshot;
+import org.apache.doris.catalog.DatabaseIf;
+import org.apache.doris.catalog.TableIf;
+import org.apache.doris.datasource.CatalogIf;
+import org.apache.doris.datasource.ExternalTable;
+import org.apache.doris.datasource.hive.HMSExternalTable;
+import org.apache.doris.datasource.hive.HMSExternalTable.DLAType;
+import org.apache.doris.datasource.iceberg.IcebergExternalTable;
+import org.apache.doris.datasource.jdbc.JdbcExternalTable;
+import org.apache.doris.datasource.mvcc.MvccSnapshot;
+import org.apache.doris.datasource.paimon.PaimonExternalTable;
+import org.apache.doris.nereids.rules.analysis.PreloadExternalMetadata;
+import
org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions;
+import org.apache.doris.qe.ConnectContext;
+import org.apache.doris.qe.OriginStatement;
+import org.apache.doris.qe.SessionVariable;
+
+import com.google.common.collect.ImmutableList;
+import org.junit.jupiter.api.Test;
+import org.mockito.InOrder;
+import org.mockito.Mockito;
+
+import java.util.Collections;
+import java.util.Optional;
+
+public class StatementContextTest {
+
+ @Test
+ public void testPreloadExternalTablesBeforeLock() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ HMSExternalTable hmsExternalTable =
Mockito.mock(HMSExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ MvccSnapshot mvccSnapshot = Mockito.mock(MvccSnapshot.class);
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+
Mockito.when(connectContext.getQueryIdentifier()).thenReturn("query-1");
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(hmsExternalTable.getId()).thenReturn(10L);
+ Mockito.when(hmsExternalTable.getName()).thenReturn("hudi_tbl");
+ Mockito.when(hmsExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(hmsExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.when(hmsExternalTable.supportsLatestSnapshotPreload()).thenReturn(true);
+ Mockito.when(hmsExternalTable.getDlaType()).thenReturn(DLAType.HUDI);
+
Mockito.when(hmsExternalTable.loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any()))
+ .thenReturn(mvccSnapshot);
+
Mockito.when(hmsExternalTable.getBaseSchema()).thenReturn(Collections.emptyList());
+
Mockito.when(hmsExternalTable.supportInternalPartitionPruned()).thenReturn(true);
+
Mockito.when(hmsExternalTable.initSelectedPartitions(Mockito.any())).thenReturn(SelectedPartitions.NOT_PRUNED);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getPreloadedTableCount());
+ Mockito.verify(hmsExternalTable, Mockito.times(1))
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ Mockito.verify(hmsExternalTable, Mockito.times(1)).getBaseSchema();
+ Mockito.verify(hmsExternalTable,
Mockito.times(1)).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testPreloadHiveSchemaAndPartitionsBeforeLock() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ HMSExternalTable hmsExternalTable =
Mockito.mock(HMSExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+
Mockito.when(connectContext.getQueryIdentifier()).thenReturn("query-hive");
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(hmsExternalTable.getId()).thenReturn(19L);
+ Mockito.when(hmsExternalTable.getName()).thenReturn("hive_tbl");
+ Mockito.when(hmsExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(hmsExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.when(hmsExternalTable.supportsLatestSnapshotPreload()).thenReturn(false);
+ Mockito.when(hmsExternalTable.getDlaType()).thenReturn(DLAType.HIVE);
+
Mockito.when(hmsExternalTable.getBaseSchema()).thenReturn(Collections.emptyList());
+
Mockito.when(hmsExternalTable.supportInternalPartitionPruned()).thenReturn(true);
+
Mockito.when(hmsExternalTable.initSelectedPartitions(Mockito.any())).thenReturn(SelectedPartitions.NOT_PRUNED);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getPreloadedTableCount());
+ Mockito.verify(hmsExternalTable, Mockito.never())
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ Mockito.verify(hmsExternalTable, Mockito.times(1)).getBaseSchema();
+ Mockito.verify(hmsExternalTable,
Mockito.times(1)).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testSkipPreloadWhenSessionVariableDisabled() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ HMSExternalTable hmsExternalTable =
Mockito.mock(HMSExternalTable.class);
+ SessionVariable sessionVariable = new SessionVariable();
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(hmsExternalTable.getId()).thenReturn(11L);
+
Mockito.when(hmsExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertFalse(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(0,
result.getPreloadedTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(
+ "session variable enable_preload_external_metadata is
disabled", result.getSkipReason());
+ Mockito.verify(hmsExternalTable, Mockito.never()).getBaseSchema();
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testSkipLatestPreloadWhenExplicitSnapshotExists() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ HMSExternalTable hmsExternalTable =
Mockito.mock(HMSExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+
Mockito.when(connectContext.getQueryIdentifier()).thenReturn("query-2");
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(hmsExternalTable.getId()).thenReturn(12L);
+ Mockito.when(hmsExternalTable.getName()).thenReturn("hudi_tbl");
+ Mockito.when(hmsExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(hmsExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.when(hmsExternalTable.supportsLatestSnapshotPreload()).thenReturn(true);
+ Mockito.when(hmsExternalTable.getDlaType()).thenReturn(DLAType.HUDI);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
Optional.empty(), Optional.empty());
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
+ Optional.of(new TableSnapshot("2024-01-01 00:00:00",
TableSnapshot.VersionType.TIME)),
+ Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(0,
result.getPreloadedTableCount());
+ Mockito.verify(hmsExternalTable, Mockito.never())
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ Mockito.verify(hmsExternalTable, Mockito.never()).getBaseSchema();
+ Mockito.verify(hmsExternalTable,
Mockito.never()).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testPreloadHmsIcebergLatestSnapshotBeforeLock() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ HMSExternalTable hmsExternalTable =
Mockito.mock(HMSExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ MvccSnapshot mvccSnapshot = Mockito.mock(MvccSnapshot.class);
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(hmsExternalTable.getId()).thenReturn(14L);
+ Mockito.when(hmsExternalTable.getName()).thenReturn("hms_iceberg_tbl");
+ Mockito.when(hmsExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(hmsExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.doCallRealMethod().when(hmsExternalTable).supportsLatestSnapshotPreload();
+
Mockito.when(hmsExternalTable.getDlaType()).thenReturn(DLAType.ICEBERG);
+
Mockito.when(hmsExternalTable.loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any()))
+ .thenReturn(mvccSnapshot);
+
Mockito.when(hmsExternalTable.getBaseSchema()).thenReturn(Collections.emptyList());
+
Mockito.when(hmsExternalTable.supportInternalPartitionPruned()).thenReturn(false);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getPreloadedTableCount());
+ Mockito.verify(hmsExternalTable, Mockito.times(1))
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ Mockito.verify(hmsExternalTable, Mockito.times(1)).getBaseSchema();
+ Mockito.verify(hmsExternalTable,
Mockito.never()).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testSkipHmsIcebergPreloadWhenOnlyNonLatestRelationExists() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ HMSExternalTable hmsExternalTable =
Mockito.mock(HMSExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(hmsExternalTable.getId()).thenReturn(15L);
+ Mockito.when(hmsExternalTable.getName()).thenReturn("hms_iceberg_tbl");
+ Mockito.when(hmsExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(hmsExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.doCallRealMethod().when(hmsExternalTable).supportsLatestSnapshotPreload();
+
Mockito.when(hmsExternalTable.getDlaType()).thenReturn(DLAType.ICEBERG);
+
Mockito.when(hmsExternalTable.supportInternalPartitionPruned()).thenReturn(false);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
+ Optional.of(new TableSnapshot("2024-01-01 00:00:00",
TableSnapshot.VersionType.TIME)),
+ Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(0,
result.getPreloadedTableCount());
+ Mockito.verify(hmsExternalTable, Mockito.never())
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ Mockito.verify(hmsExternalTable, Mockito.never()).getBaseSchema();
+ Mockito.verify(hmsExternalTable,
Mockito.never()).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testPreloadJdbcExternalTablesBeforeLock() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ JdbcExternalTable jdbcExternalTable =
Mockito.mock(JdbcExternalTable.class);
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+
Mockito.when(connectContext.getQueryIdentifier()).thenReturn("query-3");
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(jdbcExternalTable.getId()).thenReturn(13L);
+
Mockito.when(jdbcExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.when(jdbcExternalTable.getBaseSchema()).thenReturn(Collections.emptyList());
+
Mockito.when(jdbcExternalTable.supportInternalPartitionPruned()).thenReturn(false);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+
statementContext.registerExternalTableForPreload(jdbcExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getPreloadedTableCount());
+ Mockito.verify(jdbcExternalTable,
Mockito.times(1)).getBaseSchema();
+ Mockito.verify(jdbcExternalTable,
Mockito.never()).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testSkipPreloadForUnsupportedExternalTable() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ ExternalTable unsupportedExternalTable =
Mockito.mock(ExternalTable.class);
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+
Mockito.when(unsupportedExternalTable.supportsExternalMetadataPreload()).thenReturn(false);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+
statementContext.registerExternalTableForPreload(unsupportedExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertFalse(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(0,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(0,
result.getPreloadedTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(
+ "no external preload candidates were collected",
result.getSkipReason());
+ Mockito.verify(unsupportedExternalTable,
Mockito.never()).getBaseSchema();
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testSkipPreloadWhenNoInternalTableNeedsPlanReadLock() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ HMSExternalTable hmsExternalTable =
Mockito.mock(HMSExternalTable.class);
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(false);
+ Mockito.when(hmsExternalTable.getId()).thenReturn(15L);
+
Mockito.when(hmsExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+ statementContext.registerExternalTableForPreload(hmsExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertFalse(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(0,
result.getPreloadedTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(
+ "no internal tables require plan-time read lock",
result.getSkipReason());
+ Mockito.verify(hmsExternalTable, Mockito.never()).getBaseSchema();
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testPreloadIcebergLatestSnapshotBeforeLock() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ IcebergExternalTable icebergExternalTable =
Mockito.mock(IcebergExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ MvccSnapshot mvccSnapshot = Mockito.mock(MvccSnapshot.class);
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(icebergExternalTable.getId()).thenReturn(16L);
+ Mockito.when(icebergExternalTable.getName()).thenReturn("iceberg_tbl");
+ Mockito.when(icebergExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(icebergExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.when(icebergExternalTable.supportsLatestSnapshotPreload()).thenReturn(true);
+
Mockito.when(icebergExternalTable.loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any()))
+ .thenReturn(mvccSnapshot);
+
Mockito.when(icebergExternalTable.getBaseSchema()).thenReturn(Collections.emptyList());
+
Mockito.when(icebergExternalTable.supportInternalPartitionPruned()).thenReturn(false);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+
statementContext.registerExternalTableForPreload(icebergExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getPreloadedTableCount());
+ Mockito.verify(icebergExternalTable, Mockito.times(1))
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ Mockito.verify(icebergExternalTable,
Mockito.times(1)).getBaseSchema();
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testSkipIcebergPreloadWhenOnlyNonLatestRelationExists() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ IcebergExternalTable icebergExternalTable =
Mockito.mock(IcebergExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(icebergExternalTable.getId()).thenReturn(18L);
+ Mockito.when(icebergExternalTable.getName()).thenReturn("iceberg_tbl");
+ Mockito.when(icebergExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(icebergExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.when(icebergExternalTable.supportsLatestSnapshotPreload()).thenReturn(true);
+
Mockito.when(icebergExternalTable.supportInternalPartitionPruned()).thenReturn(true);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+
statementContext.registerExternalTableForPreload(icebergExternalTable,
+ Optional.of(new TableSnapshot("2024-01-01 00:00:00",
TableSnapshot.VersionType.TIME)),
+ Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(0,
result.getPreloadedTableCount());
+ Mockito.verify(icebergExternalTable, Mockito.never())
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ Mockito.verify(icebergExternalTable,
Mockito.never()).getBaseSchema();
+ Mockito.verify(icebergExternalTable,
Mockito.never()).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @Test
+ public void testPreloadPaimonLatestSnapshotBeforeLock() {
+ ConnectContext connectContext = Mockito.mock(ConnectContext.class);
+ TableIf internalTable = Mockito.mock(TableIf.class);
+ PaimonExternalTable paimonExternalTable =
Mockito.mock(PaimonExternalTable.class);
+ DatabaseIf<TableIf> database = mockDatabase();
+ CatalogIf<?> catalog = mockCatalog();
+ MvccSnapshot mvccSnapshot = Mockito.mock(MvccSnapshot.class);
+ SessionVariable sessionVariable = new SessionVariable();
+ sessionVariable.setEnablePreloadExternalMetadata(true);
+
+
Mockito.when(connectContext.getSessionVariable()).thenReturn(sessionVariable);
+ Mockito.when(internalTable.needReadLockWhenPlan()).thenReturn(true);
+ Mockito.when(paimonExternalTable.getId()).thenReturn(17L);
+ Mockito.when(paimonExternalTable.getName()).thenReturn("paimon_tbl");
+ Mockito.when(paimonExternalTable.getDatabase()).thenReturn(database);
+ Mockito.when(database.getFullName()).thenReturn("db");
+ Mockito.when(database.getCatalog()).thenReturn(catalog);
+ Mockito.when(catalog.getName()).thenReturn("ctl");
+
Mockito.when(paimonExternalTable.supportsExternalMetadataPreload()).thenReturn(true);
+
Mockito.when(paimonExternalTable.supportsLatestSnapshotPreload()).thenReturn(true);
+
Mockito.when(paimonExternalTable.loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any()))
+ .thenReturn(mvccSnapshot);
+
Mockito.when(paimonExternalTable.getBaseSchema()).thenReturn(Collections.emptyList());
+
Mockito.when(paimonExternalTable.supportInternalPartitionPruned()).thenReturn(true);
+
Mockito.when(paimonExternalTable.initSelectedPartitions(Mockito.any())).thenReturn(SelectedPartitions.NOT_PRUNED);
+
+ StatementContext statementContext = new
StatementContext(connectContext, new OriginStatement("select 1", 0));
+ try {
+ statementContext.getTables().put(ImmutableList.of("ctl", "db",
"internal"), internalTable);
+
statementContext.registerExternalTableForPreload(paimonExternalTable,
Optional.empty(), Optional.empty());
+
+ ExternalMetadataPreloadResult result =
executePreload(statementContext);
+
+ org.junit.jupiter.api.Assertions.assertTrue(result.isExecuted());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getCandidateTableCount());
+ org.junit.jupiter.api.Assertions.assertEquals(1,
result.getPreloadedTableCount());
+ InOrder inOrder = Mockito.inOrder(paimonExternalTable);
+ inOrder.verify(paimonExternalTable, Mockito.times(1))
+ .loadSnapshot(Mockito.<Optional<TableSnapshot>>any(),
Mockito.any());
+ inOrder.verify(paimonExternalTable,
Mockito.times(1)).getBaseSchema();
+ inOrder.verify(paimonExternalTable,
Mockito.times(1)).initSelectedPartitions(Mockito.any());
+ } finally {
+ statementContext.close();
+ }
+ }
+
+ @SuppressWarnings("unchecked")
+ private DatabaseIf<TableIf> mockDatabase() {
+ return Mockito.mock(DatabaseIf.class);
+ }
+
+ private CatalogIf<?> mockCatalog() {
+ return Mockito.mock(CatalogIf.class);
+ }
+
+ private ExternalMetadataPreloadResult executePreload(StatementContext
statementContext) {
+ ExternalMetadataPreloadResult result = new
PreloadExternalMetadata().executePreload(statementContext);
+ statementContext.setExternalMetadataPreloadResult(result);
+ return result;
+ }
+}
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java
b/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java
index 2610a8b8b48..e8f7d7c0ed2 100644
--- a/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java
+++ b/fe/fe-core/src/test/java/org/apache/doris/qe/SessionVariablesTest.java
@@ -17,7 +17,11 @@
package org.apache.doris.qe;
+import org.apache.doris.analysis.SetType;
+import org.apache.doris.analysis.SetVar;
+import org.apache.doris.analysis.StringLiteral;
import org.apache.doris.common.Config;
+import org.apache.doris.common.DdlException;
import org.apache.doris.common.FeConstants;
import org.apache.doris.nereids.parser.NereidsParser;
import org.apache.doris.utframe.TestWithFeService;
@@ -172,4 +176,16 @@ public class SessionVariablesTest extends
TestWithFeService {
Assertions.assertTrue(sv.isMorValuePredicatePushdownEnabled("db1",
"tbl1"));
Assertions.assertFalse(sv.isMorValuePredicatePushdownEnabled("db2",
"tbl1"));
}
+
+ @Test
+ public void testEnablePreloadExternalMetadata() throws DdlException {
+
Assertions.assertFalse(sessionVariable.isEnablePreloadExternalMetadata());
+
+ // Verify the new preload switch can be changed through the standard
session variable path.
+ VariableMgr.setVar(sessionVariable, new SetVar(SetType.SESSION,
+ SessionVariable.ENABLE_PRELOAD_EXTERNAL_METADATA,
+ new StringLiteral("true")));
+
+
Assertions.assertTrue(sessionVariable.isEnablePreloadExternalMetadata());
+ }
}
diff --git
a/regression-test/suites/external_table_p0/jdbc/test_preload_external_metadata_profile.groovy
b/regression-test/suites/external_table_p0/jdbc/test_preload_external_metadata_profile.groovy
new file mode 100644
index 00000000000..5bcebb7da9e
--- /dev/null
+++
b/regression-test/suites/external_table_p0/jdbc/test_preload_external_metadata_profile.groovy
@@ -0,0 +1,195 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+import java.util.regex.Pattern
+
+suite("test_preload_external_metadata_profile",
"p0,external,doris,external_docker,external_docker_doris") {
+ String dbName = "regression_test_preload_external_metadata_profile"
+ String catalogOff = "preload_external_metadata_profile_off"
+ String catalogOn = "preload_external_metadata_profile_on"
+ String internalTable = "preload_profile_internal"
+ String externalTablePrefix = "preload_profile_jdbc"
+ int externalTableCount = 10
+ int extraColumnCount = 60
+ int randomSuffix = Math.random() * 2000000000
+
+ String jdbcUrl = context.config.jdbcUrl
+ String jdbcUser = context.config.jdbcUser
+ String jdbcPassword = context.config.jdbcPassword
+ String s3Endpoint = getS3Endpoint()
+ String bucket = getS3BucketName()
+ String driverUrl =
"https://${bucket}.${s3Endpoint}/regression/jdbc_driver/mysql-connector-j-8.4.0.jar"
+ Map<String, Map<String, Integer>> profileMetrics = [:]
+
+ def parseTimeMs = { String timeText ->
+ if (timeText == "0") {
+ return 0
+ }
+ int totalMs = 0
+ def matcher =
Pattern.compile("(\\d+)(hour|min|sec|ms)").matcher(timeText)
+ while (matcher.find()) {
+ int value = Integer.parseInt(matcher.group(1))
+ String unit = matcher.group(2)
+ if (unit == "hour") {
+ totalMs += value * 60 * 60 * 1000
+ } else if (unit == "min") {
+ totalMs += value * 60 * 1000
+ } else if (unit == "sec") {
+ totalMs += value * 1000
+ } else if (unit == "ms") {
+ totalMs += value
+ }
+ }
+ if (totalMs == 0) {
+ fail("Could not parse time counter: ${timeText}")
+ }
+ return totalMs
+ }
+
+ def extractCounterMs = { String profileString, String counterName ->
+ String flexibleCounterName = counterName.split(" ").collect {
Pattern.quote(it) }.join("\\s+")
+ Pattern pattern = Pattern.compile(flexibleCounterName +
"\\s*:\\s*([0-9]+(?:hour|min|sec|ms)?(?:[0-9]+(?:min|sec|ms))*)")
+ def matcher = pattern.matcher(profileString)
+ if (!matcher.find()) {
+ fail("Could not find profile counter:
${counterName}\n${profileString}")
+ }
+ return parseTimeMs(matcher.group(1))
+ }
+
+ def buildJdbcCatalog = { String catalogName ->
+ sql """ DROP CATALOG IF EXISTS ${catalogName} """
+ sql """
+ CREATE CATALOG `${catalogName}` PROPERTIES (
+ "user" = "${jdbcUser}",
+ "type" = "jdbc",
+ "password" = "${jdbcPassword}",
+ "jdbc_url" = "${jdbcUrl}",
+ "driver_url" = "${driverUrl}",
+ "driver_class" = "com.mysql.cj.jdbc.Driver"
+ )
+ """
+ }
+
+ def buildQuery = { String catalogName ->
+ String selectColumns = (1..externalTableCount).collect { idx ->
+ "j${idx}.c1 AS j${idx}_c1, j${idx}.c${extraColumnCount} AS
j${idx}_c${extraColumnCount}"
+ }.join(",\n")
+ String joins = (1..externalTableCount).collect { idx ->
+ "JOIN ${catalogName}.${dbName}.${externalTablePrefix}_${idx}
j${idx} ON i.id = j${idx}.id"
+ }.join("\n")
+ return """
+ SELECT i.id,
+ ${selectColumns}
+ FROM ${internalTable} i
+ ${joins}
+ ORDER BY i.id
+ """
+ }
+
+ def checkPreloadProfile = { String tag, String catalogName, boolean
enablePreload, boolean expectPreload ->
+ sql """ SET enable_preload_external_metadata = ${enablePreload} """
+ profile(tag) {
+ run {
+ def result = sql """ /* ${tag} */ ${buildQuery(catalogName)}
"""
+ assertEquals(2, result.size())
+ }
+
+ check { profileString, exception ->
+ if (exception != null) {
+ throw exception
+ }
+ assertTrue(profileString.contains("PhysicalJdbcScan"))
+ int preloadMs = extractCounterMs(profileString, "Nereids
Preload External Metadata Time")
+ int lockMs = extractCounterMs(profileString, "Nereids Lock
Table Time")
+ int analysisMs = extractCounterMs(profileString, "Nereids
Analysis Time")
+ profileMetrics[tag] = [
+ preloadMs: preloadMs,
+ lockMs: lockMs,
+ analysisMs: analysisMs
+ ]
+ log.info("preload external metadata profile: tag={},
preloadMs={}, lockMs={}, analysisMs={}",
+ tag, preloadMs, lockMs, analysisMs)
+ if (expectPreload) {
+ assertTrue("preload metadata time should be recorded in
preload counter", preloadMs > 0)
+ } else {
+ assertEquals(0, preloadMs)
+ }
+ assertTrue("lock table counter should be non-negative", lockMs
>= 0)
+ }
+ }
+ }
+
+ try {
+ sql """ SET enable_nereids_planner = true """
+ sql """ SET enable_fallback_to_original_planner = false """
+ sql """ DROP CATALOG IF EXISTS ${catalogOff} """
+ sql """ DROP CATALOG IF EXISTS ${catalogOn} """
+ sql """ DROP DATABASE IF EXISTS ${dbName} FORCE """
+ sql """ CREATE DATABASE ${dbName} """
+ sql """ USE ${dbName} """
+
+ sql """
+ CREATE TABLE ${internalTable} (
+ id INT,
+ v INT
+ )
+ DISTRIBUTED BY HASH(id) BUCKETS 1
+ PROPERTIES("replication_num" = "1")
+ """
+ sql """ INSERT INTO ${internalTable} VALUES (1, 10), (2, 20), (3, 30)
"""
+
+ String extraColumns = (1..extraColumnCount).collect { idx ->
"`c${idx}` VARCHAR(20)" }.join(",\n")
+ String extraValues = (1..extraColumnCount).collect { idx ->
"'v${idx}'" }.join(", ")
+ for (int i = 1; i <= externalTableCount; i++) {
+ sql """
+ CREATE TABLE ${externalTablePrefix}_${i} (
+ id INT,
+ ${extraColumns}
+ )
+ DISTRIBUTED BY HASH(id) BUCKETS 1
+ PROPERTIES("replication_num" = "1")
+ """
+ sql """
+ INSERT INTO ${externalTablePrefix}_${i}
+ VALUES (1, ${extraValues}), (3, ${extraValues})
+ """
+ }
+
+ buildJdbcCatalog(catalogOff)
+ buildJdbcCatalog(catalogOn)
+
+ String offTag = "preload_external_metadata_profile_off_${randomSuffix}"
+ String onTag = "preload_external_metadata_profile_on_${randomSuffix}"
+ checkPreloadProfile(offTag, catalogOff, false, false)
+ checkPreloadProfile(onTag, catalogOn, true, true)
+ Map<String, Integer> offMetrics = profileMetrics[offTag]
+ Map<String, Integer> onMetrics = profileMetrics[onTag]
+ assertTrue(offMetrics != null)
+ assertTrue(onMetrics != null)
+ int analysisDropMs = offMetrics.analysisMs - onMetrics.analysisMs
+ int allowedNegativeJitterMs = Math.max(20, (int) (onMetrics.preloadMs
* 0.1))
+ log.info("preload external metadata profile comparison:
analysisDropMs={}, preloadMs={}, "
+ + "allowedNegativeJitterMs={}", analysisDropMs,
onMetrics.preloadMs, allowedNegativeJitterMs)
+ assertTrue("analysis time should not increase meaningfully when
metadata is preloaded",
+ analysisDropMs + allowedNegativeJitterMs >= 0)
+ } finally {
+ sql """ SET enable_preload_external_metadata = false """
+ sql """ DROP CATALOG IF EXISTS ${catalogOff} """
+ sql """ DROP CATALOG IF EXISTS ${catalogOn} """
+ sql """ DROP DATABASE IF EXISTS ${dbName} FORCE """
+ }
+}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]