[ https://issues.apache.org/jira/browse/IMPALA-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18043101#comment-18043101 ]
ASF subversion and git services commented on IMPALA-12709:
----------------------------------------------------------
Commit bff814f0790eedc2ab8040c8ff2b9ad0218c9c69 in impala's branch
refs/heads/master from Sai Hemanth Gantasala
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=bff814f07 ]
IMPALA-14562: Enable Hierarchical event processing by default
IMPALA-12709 added support for hierarchical metastore event processing.
This commit enables hierarchical event processing by default.
hms_event_polling_interval_s can now be set to a decimal value (e.g. 0.5)
to support polling intervals with millisecond precision. Other configs
can also be fine-tuned, such as:
- num_db_event_executors: sets the number of database-level event
  executors.
- num_table_event_executors_per_db_event_executor: sets the number of
  table-level event executors within a database event executor.
- min_event_processor_idle_ms: sets the minimum time to retain idle db
  processors and table processors.
- max_outstanding_events_on_executors: sets the maximum number of
  outstanding events to process on event executors.
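
As a rough illustration of the settings listed above, the Python sketch
below renders them as startup flags and converts the decimal polling
interval to milliseconds. Only the flag names and the 0.5 example come
from the commit message. Every other value is an assumed placeholder, not
a confirmed default, and the --name=value syntax and both helper
functions are assumptions made for this example.

```python
# Illustrative values only; apart from 0.5, none come from the commit message.
settings = {
    "hms_event_polling_interval_s": 0.5,                    # decimal seconds (from the commit message)
    "num_db_event_executors": 4,                            # assumed value
    "num_table_event_executors_per_db_event_executor": 2,   # assumed value
    "min_event_processor_idle_ms": 60000,                   # assumed value
    "max_outstanding_events_on_executors": 1000,            # assumed value
}


def to_startup_flags(settings):
    """Render settings as --name=value strings (assumed flag syntax)."""
    return [f"--{name}={value}" for name, value in settings.items()]


def polling_interval_ms(settings):
    """Convert the decimal-second polling interval to whole milliseconds."""
    return round(settings["hms_event_polling_interval_s"] * 1000)


flags = to_startup_flags(settings)
interval_ms = polling_interval_ms(settings)
```

With the example value of 0.5, the polling interval works out to 500
milliseconds.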
Testing:
- All the testing required to enable this flag is done in IMPALA-12709
and IMPALA-13801.
Change-Id: Ie9a28f863ef17456817e0a335215450e514b1f5b
Reviewed-on: http://gerrit.cloudera.org:8080/23687
Reviewed-by: <[email protected]>
Reviewed-by: Quanlong Huang <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Hierarchical metastore event processing
> ---------------------------------------
>
> Key: IMPALA-12709
> URL: https://issues.apache.org/jira/browse/IMPALA-12709
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Venugopal Reddy K
> Assignee: Venugopal Reddy K
> Priority: Major
> Labels: catalog-2024
> Fix For: Impala 5.0.0
>
> Attachments: Hierarchical metastore event processing.docx
>
>
> *Current Issue:*
> At present, the metastore event processor is single-threaded. Notification
> events are processed sequentially, with at most 1000 events fetched and
> processed in a single batch. Multiple locks are used to address the
> concurrency issues that may arise when catalog DDL operation processing and
> metastore event processing try to access or update the catalog objects
> concurrently. Waiting for a lock, or for a table's file metadata to load, can
> slow event processing and delay the events that follow, even when those
> events do not depend on the earlier one. As a result, synchronizing all the
> HMS events takes a very long time.
> *Proposal:*
> The existing metastore event processing can be turned into multi-level event
> processing. The idea is to segregate events based on their dependencies,
> maintain the order in which events occur within each dependency group, and
> process the groups independently as much as possible:
> # All the events of a table are processed in the order in which they
> actually occurred.
> # Events of different tables are processed in parallel.
> # When a database is altered, all events relating to the database (i.e.,
> for all its tables) that occur after the alter database event are processed
> only after the alter database event is processed, preserving the order.
> An initial proposal design document is attached:
> https://docs.google.com/document/d/1KZ-ANko-qn5CYmY13m4OVJXAYjLaS1VP-c64Pumipq8/edit?pli=1#heading=h.qyk8qz8ez37b
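
The three ordering rules in the proposal can be sketched as a small
planning function. This is a simplified Python illustration, not the
Impala implementation: the Event tuple, the plan function, and the stage
layout are all hypothetical names introduced here. It also makes a
database-level event wait for the table events that precede it, which is
a simplifying assumption the proposal does not state.

```python
from collections import namedtuple

# Hypothetical event shape; table=None marks a database-level event
# such as an alter database.
Event = namedtuple("Event", ["event_id", "db", "table"])


def plan(events):
    """Group events into ordered stages per database.

    Each stage is either ("db", event) for a database-level event, or
    ("tables", {table: [events...]}) for table events. Within a "tables"
    stage, different tables may be processed in parallel, while the list
    for each table keeps the original event order.
    """
    stages = {}
    for ev in sorted(events, key=lambda e: e.event_id):
        db_stages = stages.setdefault(ev.db, [])
        if ev.table is None:
            # A database-level event acts as a barrier for its database.
            db_stages.append(("db", ev))
        else:
            # Start a new table stage after a barrier (or at the beginning).
            if not db_stages or db_stages[-1][0] != "tables":
                db_stages.append(("tables", {}))
            db_stages[-1][1].setdefault(ev.table, []).append(ev)
    return stages


events = [
    Event(1, "db1", "t1"),
    Event(2, "db1", "t2"),
    Event(3, "db1", None),   # alter database on db1
    Event(4, "db1", "t1"),
    Event(5, "db2", "t3"),
    Event(6, "db1", "t1"),
]
result = plan(events)
```

In this example, events 1 and 2 on db1 fall into one stage and can run in
parallel because they touch different tables. Events 4 and 6 land in a
later stage behind the alter database event 3 and stay in order on t1.
Event 5 belongs to db2 and is not held back by anything on db1.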
--
This message was sent by Atlassian Jira
(v8.20.10#820010)