[
https://issues.apache.org/jira/browse/IMPALA-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang reassigned IMPALA-13178:
---------------------------------------
Assignee: (was: Quanlong Huang)
> Flush the metadata cache to remote storage instead of just invalidating them
> in full GCs
> ----------------------------------------------------------------------------------------
>
> Key: IMPALA-13178
> URL: https://issues.apache.org/jira/browse/IMPALA-13178
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Quanlong Huang
> Priority: Critical
> Labels: catalog-2024
>
> When invalidate_tables_on_memory_pressure is enabled, catalogd will
> invalidate 10% (configured by invalidate_tables_fraction_on_memory_pressure)
> of the tables if the old gen usage of JVM still exceeds 60% (configured by
> invalidate_tables_gc_old_gen_full_threshold) after a full GC.
> Later if the table is used again, catalogd will try to load its metadata. The
> loading process could also lead to OOM (see IMPALA-13117).
> On the other hand, the metadata might have no changes so it's a waste to
> evict and reload them again. Fetching all the partitions from HMS and file
> listing on the storage are expensive. It'd be better to flush out the
> metadata cache of a table instead of just invalidating it. If there are no
> more invalidates (either implicit ones from HMS event processing or explicit
> ones from user commands) on the table, we can reuse the flushed metadata.
> They can be flushed to the remote storage (e.g. HDFS/Ozone/S3) so catalogd
> has unlimited space to use. We can consider just flushing out the
> encodedFileDescriptors (the file metadata) and incremental stats which are
> usually the majority of the metadata cache. Or use a well-defined format
> (e.g. Iceberg manifest files) so we can incrementally flush the metadata even
> with catalog changes (DDL/DMLs).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]