[ 
https://issues.apache.org/jira/browse/IMPALA-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930685#comment-17930685
 ] 

ASF subversion and git services commented on IMPALA-13599:
----------------------------------------------------------

Commit ce9b927d547ac4290275fede4843288bbf97a429 in impala's branch 
refs/heads/master from Sai Hemanth Gantasala
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ce9b927d5 ]

IMPALA-13599: Reduce the number of interactions with alter_partition()
HMS API

The drop incremental stats and set cached/uncached operations in Impala
call the alter_partition() HMS API once per partition in the table. This
patch removes these unnecessary interactions by calling the
alter_partitions() HMS API in batches instead, which takes about
#partitions/500 calls, since bulkAlterPartitions() has a limit of 500
partitions per HMS RPC.
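
The batching described above can be sketched roughly as follows. This is
only an illustration in Python, not the code from the patch: the function
name bulk_alter_partitions, the alter_partitions_rpc callback, and the
MAX_PARTITIONS_PER_RPC constant are hypothetical names; only the limit of
500 partitions per RPC comes from the commit message.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")

# Limit taken from the commit message: 500 partitions per HMS RPC.
MAX_PARTITIONS_PER_RPC = 500


def bulk_alter_partitions(
    partitions: Sequence[P],
    alter_partitions_rpc: Callable[[Sequence[P]], None],
    batch_size: int = MAX_PARTITIONS_PER_RPC,
) -> int:
    """Send partitions in batches and return the number of RPCs issued.

    alter_partitions_rpc is a hypothetical stand-in for a client call
    that alters many partitions in one request.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    calls = 0
    for start in range(0, len(partitions), batch_size):
        # One RPC per slice of at most batch_size partitions.
        alter_partitions_rpc(partitions[start:start + batch_size])
        calls += 1
    return calls
```

Under these assumptions, a table with 1,201 partitions needs 3 calls
(500 + 500 + 201) instead of 1,201 single-partition calls.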

Note: This doesn't change the number of ALTER_PARTITION events
generated by HMS. Once HIVE-27746 is included in the Impala build, this
patch will bring a further benefit: HMS will generate only
#partitions/500 ALTER_PARTITIONS events.

Testing:
- Added an end-to-end test to verify that HMS API is called only once.

Change-Id: I2f2f1d9637e8be9c931da0415a17dd0839637e4c
Reviewed-on: http://gerrit.cloudera.org:8080/22197
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Reduce ALTER_PARTITION events fired from Impala
> -----------------------------------------------
>
>                 Key: IMPALA-13599
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13599
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Sai Hemanth Gantasala
>            Assignee: Sai Hemanth Gantasala
>            Priority: Major
>
> There are a couple of APIs in Impala that fire ALTER_PARTITION for each 
> partition.
> {code:java}
> Alter table foo set cached/uncached;
> Drop stats table_name;{code}
> These table-level operations call the HMS API alter_partition() thousands of
> times if the table has thousands of partitions. As a side effect, this
> generates thousands of ALTER_PARTITION events.
> We should optimize these operations to call the alter_partitions() HMS API
> only once and update the partitions in bulk.
> P.S.: HIVE-27746 - We can ultimately leverage this to have a single
> alter_partitions event, which can help the event processor catch up with lag
> quickly.


