[ https://issues.apache.org/jira/browse/HIVE-26632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17617980#comment-17617980 ]
Chris Nauroth commented on HIVE-26632:
--------------------------------------

Update DelegationTokenSecretManager current key ID to prevent erroneous database updates.

{{TokenStoreDelegationTokenSecretManager#logUpdateMasterKey}}, used in combination with {{DBTokenStore}}, inserts a new master key into the {{MASTER_KEYS}} table. The serialized {{DelegationKey}} stored in the {{MASTER_KEY}} column initially will contain an ID with the value of the base class member {{AbstractDelegationTokenSecretManager#currentId}}. For example, for a freshly started HiveMetaStore process, this will use a value of 1. Then, there is a second update performed on the database row, setting a new serialized {{DelegationKey}} with an ID that matches the value of the auto-incrementing {{KEY_ID}} column. Note that this method does not update {{AbstractDelegationTokenSecretManager#currentId}} for agreement with the new key ID.

{{TokenStoreDelegationTokenSecretManager#rollMasterKeyExt}} scans all rows in {{MASTER_KEYS}}. If it finds a serialized {{MASTER_KEY}} with an ID that matches its current value for {{AbstractDelegationTokenSecretManager#currentId}}, then it will update the database row.

We have observed a race condition while running multiple HiveMetaStore instances sharing the same database. The steps performed in {{logUpdateMasterKey}} are not transactional. It's possible that {{rollMasterKeyExt}} running in HiveMetaStore A scans a newly inserted row from {{logUpdateMasterKey}} running in HiveMetaStore B that has not had the ID updated to the correct value yet. If that ID matches the {{currentId}} in HiveMetaStore A, then it will attempt to update, but the ID in the update query won't match any row's {{KEY_ID}}. The update fails with an exception:

{code}
2022-08-03T00:09:48,744 ERROR [Thread[Thread-9,5,main]] thrift.TokenStoreDelegationTokenSecretManager: ExpiredTokenRemover thread received unexpected exception. org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: NoSuchObjectException(message:No key found with keyId: 1)
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: NoSuchObjectException(message:No key found with keyId: 1)
	at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnTokenStore(DBTokenStore.java:170) ~[hive-exec-2.3.7.jar:2.3.7]
	at org.apache.hadoop.hive.thrift.DBTokenStore.updateMasterKey(DBTokenStore.java:51) ~[hive-exec-2.3.7.jar:2.3.7]
	at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.rollMasterKeyExt(TokenStoreDelegationTokenSecretManager.java:269) ~[hive-exec-2.3.7.jar:2.3.7]
	at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager$ExpiredTokenRemover.run(TokenStoreDelegationTokenSecretManager.java:301) [hive-exec-2.3.7.jar:2.3.7]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]
Caused by: org.apache.hadoop.hive.metastore.api.NoSuchObjectException: No key found with keyId: 1
	at org.apache.hadoop.hive.metastore.ObjectStore.updateMasterKey(ObjectStore.java:7727) ~[hive-exec-2.3.7.jar:2.3.7]
{code}

When this exception happens, the {{ExpiredTokenRemover}} thread will not update {{lastMasterKeyUpdate}}, so it immediately tries again to create a new master key, causing increased database load and extraneous rows in the {{MASTER_KEYS}} table.

This problem can be prevented if {{logUpdateMasterKey}} also updates the base class {{currentId}} to the correct ID value.

> Update DelegationTokenSecretManager current key ID to prevent erroneous
> database updates.
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-26632
>                 URL: https://issues.apache.org/jira/browse/HIVE-26632
>             Project: Hive
>          Issue Type: Bug
>          Components: Standalone Metastore
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>            Priority: Major
>
> While rolling a new master key, {{TokenStoreDelegationTokenSecretManager}}
> does not update a base class member variable that tracks the current key ID.
> This can cause situations later where it attempts to update a key using an
> incorrect ID. This update attempt fails, even though the process had
> successfully generated a new master key. Since it appears to be a failure
> though, the thread immediately attempts to roll a new master key again,
> resulting in excess database load.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
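
For readers who want to step through the race described in the comment, here is a minimal in-memory sketch. It is not Hive code: {{FakeMasterKeys}}, {{FakeSecretManager}}, {{RaceSketch}}, and the split of {{logUpdateMasterKey}} into {{insertNewKey}} and {{finishNewKey}} are hypothetical names introduced only for illustration, and the table is modeled as a map from {{KEY_ID}} to the ID stored inside the serialized key. The starting {{KEY_ID}} of 100 is an assumption standing in for a table whose auto-increment counter has already advanced.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for MASTER_KEYS: KEY_ID column -> ID inside the serialized key. */
class FakeMasterKeys {
    private final Map<Integer, Integer> rows = new LinkedHashMap<>();
    private int nextKeyId;

    FakeMasterKeys(int firstKeyId) { this.nextKeyId = firstKeyId; }

    /** Insert a row; the "database" assigns the auto-incrementing KEY_ID. */
    synchronized int addMasterKey(int serializedId) {
        int keyId = nextKeyId++;
        rows.put(keyId, serializedId);
        return keyId;
    }

    /** Update by KEY_ID; fails when no row has that KEY_ID. */
    synchronized void updateMasterKey(int keyId, int serializedId) {
        if (!rows.containsKey(keyId)) {
            throw new IllegalStateException("No key found with keyId: " + keyId);
        }
        rows.put(keyId, serializedId);
    }

    synchronized Map<Integer, Integer> snapshot() { return new LinkedHashMap<>(rows); }
}

/** Hypothetical, heavily simplified model of one metastore's secret manager. */
class FakeSecretManager {
    private final FakeMasterKeys store;
    private final boolean syncCurrentId; // true = apply the proposed fix
    int currentId = 1;                   // value assumed for a freshly started process

    FakeSecretManager(FakeMasterKeys store, boolean syncCurrentId) {
        this.store = store;
        this.syncCurrentId = syncCurrentId;
    }

    /** First half of logUpdateMasterKey: insert using the in-memory ID. */
    int insertNewKey() { return store.addMasterKey(currentId); }

    /** Second half: rewrite the row so the serialized ID equals KEY_ID. */
    void finishNewKey(int keyId) {
        store.updateMasterKey(keyId, keyId);
        if (syncCurrentId) {
            currentId = keyId; // the proposed fix: keep currentId in agreement
        }
    }

    /** Scan all rows and update any whose serialized ID equals currentId. */
    void rollMasterKeyExt() {
        for (int serializedId : store.snapshot().values()) {
            if (serializedId == currentId) {
                store.updateMasterKey(currentId, serializedId);
            }
        }
    }
}

class RaceSketch {
    /** Replays the interleaving from the comment; returns true if A's update fails. */
    static boolean raceFails(boolean syncCurrentId) {
        FakeMasterKeys table = new FakeMasterKeys(100);
        FakeSecretManager a = new FakeSecretManager(table, syncCurrentId);
        FakeSecretManager b = new FakeSecretManager(table, syncCurrentId);

        a.finishNewKey(a.insertNewKey()); // A completes both steps
        int pending = b.insertNewKey();   // B has only inserted so far
        try {
            a.rollMasterKeyExt();         // A scans B's half-written row
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            b.finishNewKey(pending);      // B finishes its second step
        }
    }

    public static void main(String[] args) {
        System.out.println("without fix, update fails: " + raceFails(false));
        System.out.println("with fix, update fails: " + raceFails(true));
    }
}
```

In this model, without the fix, A keeps {{currentId}} at 1 after rolling its key, so B's half-written row (serialized ID 1) matches and A issues an update for {{KEY_ID}} 1, which does not exist. With the fix, A's {{currentId}} follows its real {{KEY_ID}}, so the only row it matches is its own.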