Quanlong Huang created IMPALA-14062:
---------------------------------------
Summary: AlterPartition spent time in fetching latestEventId even
if EventProcessor is disabled
Key: IMPALA-14062
URL: https://issues.apache.org/jira/browse/IMPALA-14062
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Quanlong Huang
Assignee: Quanlong Huang
IMPALA-11535 introduces a performance regression when HMS event processing is
disabled, i.e. hms_event_polling_interval_s=0, in
[HdfsTable#loadPartitionsFromMetastore()|https://github.com/apache/impala/blob/eb79fbea2b452f09e0e04edc4be274942423d498/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1951].
It always fetches the latest event id to update the lastRefreshEventId of the
partitions. When HMS event processing is disabled, lastRefreshEventId is not
used, so there is no need to update it.
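One possible shape of the fix is to guard the fetch on the polling interval. The following is only a minimal sketch, not the actual Impala code: the class EventIdGuardSketch, the method eventIdForRefresh, the NO_EVENT_ID sentinel and the LongSupplier standing in for the HMS RPC are all hypothetical names, and treating any non-positive interval as "disabled" is an assumption (the report only mentions hms_event_polling_interval_s=0).

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch; none of these names are taken from the Impala code base. */
public class EventIdGuardSketch {
  /** Assumed sentinel meaning "no event id was recorded". */
  static final long NO_EVENT_ID = -1L;

  /**
   * Returns the event id to record on reloaded partitions, or NO_EVENT_ID when
   * event processing is disabled. The supplier stands in for the (potentially
   * slow) HMS call and is only invoked when polling is enabled.
   */
  static long eventIdForRefresh(int pollingIntervalSecs, LongSupplier fetchLatestEventId) {
    // Assumption: a non-positive interval means event processing is disabled.
    if (pollingIntervalSecs <= 0) return NO_EVENT_ID;
    return fetchLatestEventId.getAsLong();
  }

  public static void main(String[] args) {
    int[] rpcCalls = {0};
    // Fake HMS call that counts how often it is invoked.
    LongSupplier fakeHms = () -> { rpcCalls[0]++; return 48196L; };

    long disabled = eventIdForRefresh(0, fakeHms); // skips the fake RPC
    long enabled = eventIdForRefresh(2, fakeHms);  // performs the fake RPC once

    System.out.println(disabled + " " + enabled + " " + rpcCalls[0]);
  }
}
```

Passing the fetch as a supplier keeps the expensive call lazy, so the disabled path never touches the metastore.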
Here is an example AlterPartition statement:
{code:sql}
alter table tbl1 partition(p=0) set tblproperties('key'='myvalue'){code}
The catalog timeline in the query profile indicates that fetching the latest
event id is the bottleneck:
{noformat}
Catalog Server Operation: 4s794ms
- Got catalog version read lock: 168.861us (168.861us)
- Got catalog version write lock and table write lock: 312.483us (143.622us)
- Got Metastore client: 258.655ms (258.342ms)
- Altered 1 partitions in Metastore: 677.772ms (419.117ms)
- Got Metastore client: 678.291ms (519.440us)
- Fetched table from Metastore: 909.802ms (231.510ms)
- Loaded all column stats: 986.145ms (76.343ms)
- Loaded table schema: 993.366ms (7.220ms)
- Got current Metastore event id 48196: 4s579ms (3s585ms) <--- Bottleneck
- Start loading file metadata: 4s579ms (51.936us)
- Loaded file metadata for 1 partitions: 4s582ms (3.417ms)
- Reloaded table metadata: 4s793ms (211.162ms)
- DDL finished: 4s794ms (187.654us){noformat}