[ 
https://issues.apache.org/jira/browse/IMPALA-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955144#comment-17955144
 ] 

Quanlong Huang commented on IMPALA-14082:
-----------------------------------------

I think we still need this. I saw a scenario where users refresh multiple 
partitions every 5 minutes. The partitions might be different in each REFRESH. 
When these RELOAD events arrive in the same event batch (e.g. due to a large 
lag in event processing), we should merge them into one batch event and 
process them together.
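
To make the idea concrete, here is a minimal sketch of the merge step. It is 
not Impala's actual event-processor code: ReloadEventBatcher, ReloadEvent and 
BatchReloadEvent are hypothetical names, and the sketch assumes the input 
holds only partition-level RELOAD events in event-id order. Only consecutive 
events on the same table are merged, so the relative order of events on 
different tables is preserved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types exist in Impala under these names.
final class ReloadEventBatcher {

    /** Simplified stand-in for one partition-level RELOAD event. */
    static final class ReloadEvent {
        final long eventId;
        final String tableName;
        final String partitionName;

        ReloadEvent(long eventId, String tableName, String partitionName) {
            this.eventId = eventId;
            this.tableName = tableName;
            this.partitionName = partitionName;
        }
    }

    /** Merged event: distinct partitions to reload under one table lock. */
    static final class BatchReloadEvent {
        final String tableName;
        // LinkedHashSet drops duplicate partitions and keeps first-seen order.
        final Set<String> partitionNames = new LinkedHashSet<>();
        long lastEventId;

        BatchReloadEvent(String tableName) {
            this.tableName = tableName;
        }
    }

    /** Merges each run of consecutive RELOAD events on the same table. */
    static List<BatchReloadEvent> merge(List<ReloadEvent> events) {
        List<BatchReloadEvent> batches = new ArrayList<>();
        BatchReloadEvent current = null;
        for (ReloadEvent e : events) {
            if (current == null || !current.tableName.equals(e.tableName)) {
                // A different table ends the current run and starts a new batch.
                current = new BatchReloadEvent(e.tableName);
                batches.add(current);
            }
            current.partitionNames.add(e.partitionName);
            current.lastEventId = Math.max(current.lastEventId, e.eventId);
        }
        return batches;
    }
}
```

With such a batch, the table lock would be taken once per batch and each 
distinct partition reloaded once, instead of once per RELOAD event.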

> Batch processing RELOAD events on the same table
> ------------------------------------------------
>
>                 Key: IMPALA-14082
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14082
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Sai Hemanth Gantasala
>            Priority: Major
>
> Currently, only ALTER_PARTITION and INSERT partition events are processed in 
> batches. RELOAD events are triggered by REFRESH in other Impala clusters. 
> There could be many partition-level REFRESH statements on the same table. 
> Processing them one by one acquires the table lock multiple times to load 
> individual partitions in sequence. This also keeps the table version 
> changing, which hurts the performance of coordinators in local-catalog mode: 
> query planning has to retry to handle InconsistentMetadataFetchException 
> caused by table version changes.
> Batch processing RELOAD events on the same table can load partitions in 
> parallel and also reduce duplicate reloads.
> CC [~hemanth619], [~VenuReddy] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
