[ https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16845002#comment-16845002 ]
Aihua Xu edited comment on HIVE-21718 at 5/21/19 4:08 PM:
----------------------------------------------------------

[~ngangam] Sorry for the late reply. I will take a look as well.


was (Author: aihuaxu):
[~ngangam] Sorry for the late reply. I will take a look.

> Improvement performance of UpdateInputAccessTimeHook
> ----------------------------------------------------
>
>                 Key: HIVE-21718
>                 URL: https://issues.apache.org/jira/browse/HIVE-21718
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 2.1.1
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>            Priority: Major
>         Attachments: HIVE-21718.2.patch, HIVE-21718.patch
>
>
> Currently, Hive does not update the lastAccessTime property for any entities
> when a query accesses them. Thus it is not possible to know when a table was
> last accessed.
> Hive does provide a configurable hook to HS2 that is executed as a pre-query
> hook prior to the query being executed. However, this hook is inefficient
> because for each table or partition it is attempting to update the time for, it
> executes an "alter table ... " command internally. This is bad because:
> 1) For a query touching thousands of partitions, this hook takes forever to
> update them.
> 2) Meanwhile, it is holding up the original query from executing.
> So even though we do not recommend using the hook, because the reward is too
> little (having lastAccessTime updated), we realize there is no other means to
> achieve this.
> Also, we can improve the performance of the hook significantly by adding a
> new thrift API on HMS to update the lastAccessTime on the database rows
> directly instead of going to the HMS front end for one entity at a time (leading
> to thousands of HMS calls that lead to multiple thousands of calls to the
> database).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)