[ https://issues.apache.org/jira/browse/IMPALA-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930685#comment-17930685 ]
ASF subversion and git services commented on IMPALA-13599:
----------------------------------------------------------
Commit ce9b927d547ac4290275fede4843288bbf97a429 in impala's branch
refs/heads/master from Sai Hemanth Gantasala
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ce9b927d5 ]
IMPALA-13599: Reduce the number of interactions with alter_partition()
HMS API
The drop incremental stats and set/unset cached operations in Impala
call the alter_partition() HMS API once per partition in the table.
This patch removes these unnecessary interactions by calling the
alter_partitions() HMS API instead, which takes about #partitions/500
calls, since bulkAlterPartitions() is limited to 500 partitions per
HMS RPC.
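The batching described above can be illustrated with a minimal sketch. It is
not the code from this patch: the PartitionAlterer interface, the
alterInBatches() helper, and the use of partition name strings are
hypothetical stand-ins, and the only value taken from the commit message is
the limit of 500 partitions per RPC.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkAlterSketch {
    // Per-RPC limit of 500, as stated in the commit message.
    public static final int MAX_PARTITIONS_PER_RPC = 500;

    // Hypothetical stand-in for an HMS client; not Impala's or Hive's real interface.
    public interface PartitionAlterer {
        void alterPartitions(List<String> partitionNames);
    }

    // Sends the partitions in chunks of at most batchSize and returns
    // the number of bulk calls that were made.
    public static int alterInBatches(List<String> partitions, int batchSize,
                                     PartitionAlterer client) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int calls = 0;
        for (int start = 0; start < partitions.size(); start += batchSize) {
            int end = Math.min(start + batchSize, partitions.size());
            // Copy the chunk so the callee never sees a view of the caller's list.
            client.alterPartitions(new ArrayList<>(partitions.subList(start, end)));
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> partitions = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            partitions.add("p=" + i);
        }
        List<Integer> batchSizes = new ArrayList<>();
        int calls = alterInBatches(partitions, MAX_PARTITIONS_PER_RPC,
                batch -> batchSizes.add(batch.size()));
        // Prints: 3 calls, batch sizes [500, 500, 200]
        System.out.println(calls + " calls, batch sizes " + batchSizes);
    }
}
```

In this sketch a table with 1200 partitions needs 3 bulk calls instead of
1200 single-partition calls, and a table with 500 or fewer partitions needs
exactly one.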
Note: This doesn't change the number of ALTER_PARTITION events
generated by HMS. Once HIVE-27746 is included in the Impala build, this
patch will bring a further benefit: HMS will generate only about
#partitions/500 ALTER_PARTITIONS events.
Testing:
- Added an end-to-end test to verify that the HMS API is called only once.
Change-Id: I2f2f1d9637e8be9c931da0415a17dd0839637e4c
Reviewed-on: http://gerrit.cloudera.org:8080/22197
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Reduce ALTER_PARTITION events fired from Impala
> -----------------------------------------------
>
> Key: IMPALA-13599
> URL: https://issues.apache.org/jira/browse/IMPALA-13599
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Sai Hemanth Gantasala
> Assignee: Sai Hemanth Gantasala
> Priority: Major
>
> There are a couple of APIs in Impala that fire ALTER_PARTITION for each
> partition.
> {code:java}
> Alter table foo set cached/uncached;
> Drop stats table_name;{code}
> These table-level operations call the HMS alter_partition() API once per
> partition, so a table with thousands of partitions incurs thousands of
> calls. As a side effect, HMS generates thousands of alter partition events.
> We should optimize these operations to call the alter_partitions() HMS API
> only once and bulk update the partitions.
> P.S.: We can ultimately leverage HIVE-27746 to fire a single
> alter_partitions event, which would help the event processor catch up on
> lag quickly.