[ https://issues.apache.org/jira/browse/IMPALA-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18020932#comment-18020932 ]
ASF subversion and git services commented on IMPALA-14082:
----------------------------------------------------------
Commit 46525bcd7c76eb1145a855f3706ece6fff380b8f in impala's branch
refs/heads/master from Sai Hemanth Gantasala
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=46525bcd7 ]
IMPALA-14082: Support batch processing of RELOAD events on same table
Currently, RELOAD events on a partitioned table are processed one after
the other. Processing them one by one acquires the table lock multiple
times to load individual partitions in sequence. It also keeps the table
version changing, which hurts the performance of coordinators in
local-catalog mode: query planning must retry to handle
InconsistentMetadataFetchException caused by the table version changes.
This patch adds batch processing of RELOAD events on the same table by
reusing the existing logic of BatchPartitionEvent. The implementation
adds four new methods to the RELOAD event class: canBeBatched(),
addToBatchEvents(), getPartitionForBatching(), and getBatchEventType().
These are prerequisites for reusing the batching logic.
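The batching idea described above can be illustrated with a minimal, standalone Java sketch. It is not Impala's implementation: the classes ReloadEvent and ReloadBatch and the methods accepts(), add(), and batchConsecutive() are hypothetical names invented for this example, and the rule that only consecutive events on the same table are merged is an assumption made to keep event ordering simple. The sketch shows how merging several partition-level reloads into one batch collapses duplicate partitions, so the table would need to be locked and reloaded once per batch instead of once per event.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these types exist in Impala.
final class ReloadBatchingSketch {

  /** Hypothetical stand-in for one partition-level RELOAD event. */
  static final class ReloadEvent {
    final long eventId;
    final String dbName;
    final String tableName;
    final String partition; // e.g. "p=1"

    ReloadEvent(long eventId, String dbName, String tableName, String partition) {
      this.eventId = eventId;
      this.dbName = dbName;
      this.tableName = tableName;
      this.partition = partition;
    }
  }

  /** Hypothetical batch: all partitions to reload for one table in one pass. */
  static final class ReloadBatch {
    final String dbName;
    final String tableName;
    // LinkedHashSet drops duplicate partitions and keeps first-seen order.
    final Set<String> partitions = new LinkedHashSet<>();
    long lastEventId;

    ReloadBatch(ReloadEvent first) {
      this.dbName = first.dbName;
      this.tableName = first.tableName;
      add(first);
    }

    /** Assumption: an event joins a batch only if it targets the same table. */
    boolean accepts(ReloadEvent e) {
      return dbName.equals(e.dbName) && tableName.equals(e.tableName);
    }

    void add(ReloadEvent e) {
      partitions.add(e.partition);
      lastEventId = e.eventId;
    }
  }

  /** Merges runs of consecutive events on the same table into batches. */
  static List<ReloadBatch> batchConsecutive(List<ReloadEvent> events) {
    List<ReloadBatch> batches = new ArrayList<>();
    ReloadBatch current = null;
    for (ReloadEvent e : events) {
      if (current != null && current.accepts(e)) {
        current.add(e);
      } else {
        // A different table ends the current run and starts a new batch.
        current = new ReloadBatch(e);
        batches.add(current);
      }
    }
    return batches;
  }

  public static void main(String[] args) {
    List<ReloadEvent> events = List.of(
        new ReloadEvent(1, "db", "t1", "p=1"),
        new ReloadEvent(2, "db", "t1", "p=2"),
        new ReloadEvent(3, "db", "t1", "p=1"),
        new ReloadEvent(4, "db", "t2", "p=1"));
    for (ReloadBatch b : batchConsecutive(events)) {
      System.out.println(b.dbName + "." + b.tableName + " partitions="
          + b.partitions + " lastEventId=" + b.lastEventId);
    }
  }
}
```

In this sketch, three events on db.t1 become a single batch covering two distinct partitions, so a consumer would take the table lock and bump the table version once rather than three times.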
Testing:
- Added an end-to-end test to verify the batching.
Change-Id: Ie3e9a99b666a1c928ac2a136bded1e5420f77dab
Reviewed-on: http://gerrit.cloudera.org:8080/23159
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Batch processing RELOAD events on the same table
> ------------------------------------------------
>
> Key: IMPALA-14082
> URL: https://issues.apache.org/jira/browse/IMPALA-14082
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Sai Hemanth Gantasala
> Priority: Major
>
> Currently, only ALTER_PARTITION and INSERT partition events are processed in
> batches. RELOAD events are triggered by REFRESH in other Impala clusters.
> There can be many partition-level REFRESH statements on the same table.
> Processing them one by one acquires the table lock multiple times to load
> individual partitions in sequence. This also keeps the table version changing,
> which impacts performance of coordinators in local-catalog mode - query
> planning needs to retry to handle InconsistentMetadataFetchException due to
> table version changes.
> Batch processing RELOAD events on the same table can load partitions in
> parallel and also reduce duplicate reloads.
> CC [~hemanth619], [~VenuReddy]