[
https://issues.apache.org/jira/browse/IMPALA-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955144#comment-17955144
]
Quanlong Huang commented on IMPALA-14082:
-----------------------------------------
I think we still need this. I saw a scenario where users refresh multiple
partitions every 5 minutes. The partitions might be different in each REFRESH.
When these RELOAD events arrive in the same event batch (e.g. due to a large
lag in event processing), we should merge them into one batch event and
process them together.
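
For illustration, here is a minimal sketch of the merge step. It is written in Python rather than the catalog's Java and is not the actual event-processor code. The names Event, ReloadBatch and batch_reload_events are hypothetical. It assumes that only consecutive partition-level RELOAD events on the same table are merged, so ordering relative to other events is preserved:

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a metastore notification event."""
    event_id: int
    event_type: str                  # e.g. "RELOAD", "ALTER_TABLE"
    table: str                       # fully qualified name, e.g. "db.tbl"
    partition: Optional[str] = None  # None means a table-level event


@dataclass
class ReloadBatch:
    """Hypothetical merged event covering several partition-level RELOADs."""
    table: str
    first_event_id: int
    last_event_id: int
    partitions: List[str]


def batch_reload_events(events: List[Event]) -> List[Union[Event, ReloadBatch]]:
    """Merge runs of consecutive partition-level RELOAD events per table.

    Duplicate partitions within a run are kept only once, so each partition
    is reloaded a single time per batch.
    """
    out: List[Union[Event, ReloadBatch]] = []
    for ev in events:
        is_partition_reload = ev.event_type == "RELOAD" and ev.partition is not None
        if not is_partition_reload:
            # Any other event ends the current run and is passed through as-is.
            out.append(ev)
            continue
        prev = out[-1] if out else None
        if isinstance(prev, ReloadBatch) and prev.table == ev.table:
            # Extend the current run and skip partitions already queued.
            prev.last_event_id = ev.event_id
            if ev.partition not in prev.partitions:
                prev.partitions.append(ev.partition)
        else:
            # Start a new run for this table.
            out.append(ReloadBatch(ev.table, ev.event_id, ev.event_id, [ev.partition]))
    return out
```

With a batch like this, the table lock would be taken once per run and the deduplicated partition list could be loaded in parallel, instead of once per RELOAD event.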
> Batch processing RELOAD events on the same table
> ------------------------------------------------
>
> Key: IMPALA-14082
> URL: https://issues.apache.org/jira/browse/IMPALA-14082
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Sai Hemanth Gantasala
> Priority: Major
>
> Currently, only ALTER_PARTITION and INSERT partition events are processed in
> batches. RELOAD events are triggered by REFRESH in other Impala clusters.
> There could be lots of partition-level REFRESH statements on the same table.
> Processing them one by one acquires the table lock multiple times to load
> individual partitions in sequence. This also keeps the table version
> changing, which hurts the performance of coordinators in local-catalog mode:
> query planning needs to retry to handle InconsistentMetadataFetchException
> due to table version changes.
> Batch processing RELOAD events on the same table can load partitions in
> parallel and also reduce duplicate reloads.
> CC [~hemanth619], [~VenuReddy]