[ https://issues.apache.org/jira/browse/IMPALA-14502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030749#comment-18030749 ]
ASF subversion and git services commented on IMPALA-14502:
----------------------------------------------------------
Commit ec31324eb532652346c00b36ec42ca69e39b5d64 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ec31324eb ]
IMPALA-14502: Not tracking metrics in IncompleteTable
Tables in the unloaded state are represented as IncompleteTable.
Their table-level metrics are never used, yet they occupy around 7KB
of memory per table. This is a significant amount compared to the
table name strings.
This patch skips initializing these metrics for IncompleteTable to save
memory. This reduces the initial memory required to launch
catalogd.
To prevent other code from unintentionally adding new metrics to
IncompleteTable, this patch overrides all Table methods that use metrics_
so that they return simple results, e.g.
IncompleteTable.getMedianTableLoadingTime() always returns 0.
IncompleteTable.getMetrics() shouldn't be used, so a Precondition
check was added to enforce this.
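The pattern described above can be sketched roughly as follows. This is an illustration only, not the actual patch: the Metrics class, the trackMetrics constructor flag, and the metric name string are hypothetical stand-ins, and the precondition check is imitated by throwing IllegalStateException directly rather than through a precondition helper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IncompleteTableSketch {
  /** Hypothetical, simplified stand-in for the per-table metrics container. */
  static class Metrics {
    private final Map<String, List<Long>> timers_ = new HashMap<>();

    void addTimer(String name) { timers_.put(name, new ArrayList<>()); }

    void record(String name, long nanos) { timers_.get(name).add(nanos); }

    /** Returns the upper median of the recorded samples, or 0 if there are none. */
    long median(String name) {
      List<Long> sorted = new ArrayList<>(timers_.get(name));
      if (sorted.isEmpty()) return 0;
      Collections.sort(sorted);
      return sorted.get(sorted.size() / 2);
    }
  }

  static class Table {
    static final String LOAD_DURATION_METRIC = "load-duration";

    // Null when a subclass opts out of metrics tracking (assumed design).
    protected final Metrics metrics_;

    Table(boolean trackMetrics) {
      metrics_ = trackMetrics ? new Metrics() : null;
      if (metrics_ != null) metrics_.addTimer(LOAD_DURATION_METRIC);
    }

    Metrics getMetrics() { return metrics_; }

    long getMedianTableLoadingTime() {
      return metrics_.median(LOAD_DURATION_METRIC);
    }
  }

  static class IncompleteTable extends Table {
    // Never allocates the metrics container for unloaded tables.
    IncompleteTable() { super(false); }

    @Override
    Metrics getMetrics() {
      // Fail fast so callers cannot attach new metrics by accident.
      throw new IllegalStateException("IncompleteTable does not track metrics");
    }

    @Override
    long getMedianTableLoadingTime() { return 0; }
  }

  public static void main(String[] args) {
    Table loaded = new Table(true);
    loaded.getMetrics().record(Table.LOAD_DURATION_METRIC, 42L);
    System.out.println(loaded.getMedianTableLoadingTime());

    Table unloaded = new IncompleteTable();
    System.out.println(unloaded.getMedianTableLoadingTime());
  }
}
```

Overriding every accessor that touches metrics_ keeps callers that hold a plain Table reference working, while the throwing getMetrics() surfaces any new code path that tries to register metrics on an unloaded table.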
Tests:
- Verified in a heap dump taken after loading 1.3M IncompleteTables that
heap usage drops to 2GB and only a few instances of
com.codahale.metrics.Timer are created. Previously, catalogd ran out of
memory (OOM) with an 18GB heap when running global IM, and the number of
com.codahale.metrics.Timer instances was similar to the number of
IncompleteTables.
- Passed CORE tests.
Change-Id: If0fcfeab99bbfbefe618d0abf7f2482a0cc5ef9f
Reviewed-on: http://gerrit.cloudera.org:8080/23547
Reviewed-by: Riza Suminto <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
> Redundant Metrics in IncompleteTable consuming extra space
> ----------------------------------------------------------
>
> Key: IMPALA-14502
> URL: https://issues.apache.org/jira/browse/IMPALA-14502
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Fix For: Impala 5.0.0
>
> Attachments: Dominator.png, Histogram.png
>
>
> In a catalogd heap dump where all of the tables are unloaded, we found that
> IncompleteTable consumes more memory than just the strings for the db/table
> name and the table type/comment.
> !Histogram.png|width=676,height=436!
> As shown in the histogram, there are 2.6M instances of IncompleteTable
> consuming around 18GB of the heap space. Each instance takes around 7KB of
> memory.
> Looking into the dominator tree (grouped by class) of IncompleteTable
> instances, the majority of the space is consumed by Metrics, which will never
> be used for IncompleteTable (see
> [code|https://github.com/apache/impala/blob/ebbc67cf40bd856253d07c649028888d85c772cc/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L4242-L4244]).
> !Dominator.png|width=732,height=618!
> We should skip initializing these metrics for IncompleteTable.
> {code:java}
> public void initMetrics() {
>   metrics_.addTimer(REFRESH_DURATION_METRIC);
>   metrics_.addTimer(ALTER_DURATION_METRIC);
>   metrics_.addTimer(LOAD_DURATION_METRIC);
>   metrics_.addTimer(LOAD_DURATION_STORAGE_METADATA);
>   metrics_.addTimer(HMS_LOAD_TBL_SCHEMA);
>   metrics_.addTimer(LOAD_DURATION_ALL_COLUMN_STATS);
>   metrics_.addCounter(NUMBER_OF_INFLIGHT_EVENTS);
>   metrics_.addTimer(TBL_EVENTS_PROCESS_DURATION);
>   metrics_.addGauge(LAST_SYNC_EVENT_ID,
>       (Gauge<Long>) () -> Long.valueOf(lastSyncedEventId_));
> }
> {code}
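As a rough cross-check of the figures in the issue description, the small sketch below multiplies the reported instance count by the reported per-instance size. It assumes 7KB means 7,000 bytes and 1GB means 10^9 bytes; the class and method names are made up for illustration.

```java
import java.util.Locale;

public class IncompleteTableHeapEstimate {
  /** Total bytes for tableCount instances of bytesPerTable each (overflow-checked). */
  static long estimateBytes(long tableCount, long bytesPerTable) {
    return Math.multiplyExact(tableCount, bytesPerTable);
  }

  /** Formats a byte count as decimal gigabytes with one fractional digit. */
  static String toGb(long bytes) {
    return String.format(Locale.ROOT, "%.1f GB", bytes / 1e9);
  }

  public static void main(String[] args) {
    // 2.6M IncompleteTable instances at roughly 7KB each, as reported in the issue.
    System.out.println(toGb(estimateBytes(2_600_000L, 7_000L)));
  }
}
```

Under these assumptions the product comes to about 18.2 GB, which is consistent with the roughly 18GB attributed to IncompleteTable in the histogram.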