[ https://issues.apache.org/jira/browse/HIVE-22336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955019#comment-16955019 ]
Jesus Camacho Rodriguez commented on HIVE-22336:
------------------------------------------------

+1

> Updates should be pushed to the Metastore backend DB before creating the notification event
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-22336
>                 URL: https://issues.apache.org/jira/browse/HIVE-22336
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 4.0.0
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>         Attachments: HIVE-22336.1.patch, HIVE-22336.2.patch, HIVE-22336.3.patch
>
>
> There was an issue on HDP-3.1 where a table couldn't be deleted because some
> related objects (such as the storage descriptor) were missing from the metastore.
> A previous delete attempt on that table had gone wrong, but no rollback
> happened, which is why the SD was missing. In that previous delete, the
> notification creation swallowed the error that came from the backend DB,
> which is why no rollback happened. Here are the steps that happened in the
> first delete attempt:
>
> # Open a transaction (transaction_1) - this step was successful
> # Delete all the objects which are related to the table - this step was
> successful too, so the SD and other objects were deleted
> # Delete the table - this step failed in the backend DB, but according to the
> log the delete happens in a batch statement, so it won't necessarily be
> executed right at this moment, and we won't see an error here
> # Create a notification about the table delete:
> ## Open another transaction for the notification creation (transaction_2) -
> this calls the ObjectStore.openTransaction method, which increases a counter
> for open transactions and then checks whether there is already an active
> transaction. If there is, it just returns true and doesn't really create a
> new transaction.
> ## Lock the notification id in the metastore backend DB for update - this is
> where the exception from the backend DB (let's call it "MySQL Exception")
> manifests
> ## If an exception occurs while acquiring the lock, retry - the "MySQL
> Exception" was caught, and since there is no check on the exception, the retry
> mechanism assumes that it happened because the lock for the notification id
> couldn't be acquired, so it retries and "forgets" about the "MySQL Exception".
> ## If the lock was acquired successfully, create the notification - the second
> time, the lock was acquired successfully, so the notification creation was
> successful.
> ## Commit transaction_2 - this just decreases the transaction counter, but
> doesn't actually commit anything.
> # Commit transaction_1 - this commits the transaction, but since the error
> has already manifested and been kind of "handled", we won't see any error here,
> just that the commit was successful, so no rollback happens and the table
> object is left in an invalid state.
> # If the commit was not successful, then roll back
>
> In the customer setup, this issue could be fixed by adding a flush call
> before creating the notification event, so that all the updates are pushed to
> the backend DB and the error manifests at that point. With this, the error
> goes back to the HiveMetastore class, which does the rollback, and the delete
> table operation fails as it should, since the table couldn't be deleted. The
> HiveMetastore retry mechanism can then try the table deletion again.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
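
For readers following the sequence in the quoted description, here is a minimal Java sketch of the failure mode and of the proposed flush. It is a toy model, not Hive's code: ToyStore, dropTable, addNotification and lockNotificationId are hypothetical names. It assumes only what the description states: nested opens bump a counter, batched statements run later, and the retry loop does not inspect the exception.

```java
import java.util.ArrayList;
import java.util.List;

class FlushBeforeNotifySketch {

    /** Hypothetical stand-in for the persistence layer; not Hive's ObjectStore. */
    static class ToyStore {
        private int openTxCount = 0;                               // nesting counter
        private final List<Runnable> pending = new ArrayList<>();  // batched, not yet executed
        boolean committed = false;
        boolean rolledBack = false;

        boolean openTransaction() {
            openTxCount++;   // a nested call only bumps the counter
            return true;
        }

        void queue(Runnable statement) {
            pending.add(statement);
        }

        /** Executes the batched statements; a failing statement throws here. */
        void flush() {
            List<Runnable> batch = new ArrayList<>(pending);
            pending.clear();
            for (Runnable statement : batch) {
                statement.run();
            }
        }

        /** Assumption: the locking query pushes pending statements to the DB first. */
        void lockNotificationId() {
            flush();
        }

        boolean commitTransaction() {
            openTxCount--;
            if (openTxCount > 0) {
                return true;   // nested commit: nothing is actually committed
            }
            flush();
            committed = true;
            return true;
        }

        void rollbackTransaction() {
            openTxCount = 0;
            pending.clear();
            rolledBack = true;
        }
    }

    /** Notification creation with a retry loop that never inspects the exception. */
    static void addNotification(ToyStore store) {
        store.openTransaction();                 // "transaction_2": counter only
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                store.lockNotificationId();
                break;                           // lock acquired
            } catch (RuntimeException e) {
                // Swallowed: treated as lock contention, whatever the real cause was.
            }
        }
        store.commitTransaction();               // just decrements the counter
    }

    static ToyStore dropTable(boolean flushBeforeNotification) {
        ToyStore store = new ToyStore();
        store.openTransaction();                 // "transaction_1"
        try {
            store.queue(() -> { });              // delete related objects: succeeds
            store.queue(() -> {                  // delete table: fails once executed
                throw new IllegalStateException("backend DB error");
            });
            if (flushBeforeNotification) {
                store.flush();                   // proposed fix: error surfaces here
            }
            addNotification(store);
            store.commitTransaction();
        } catch (RuntimeException e) {
            store.rollbackTransaction();
        }
        return store;
    }

    public static void main(String[] args) {
        ToyStore buggy = dropTable(false);
        System.out.println("without flush: committed=" + buggy.committed
                + ", rolledBack=" + buggy.rolledBack);
        ToyStore fixed = dropTable(true);
        System.out.println("with flush: committed=" + fixed.committed
                + ", rolledBack=" + fixed.rolledBack);
    }
}
```

In this model the run without the flush ends with committed=true and rolledBack=false, because the retry loop absorbed the only exception. The run with the flush ends with committed=false and rolledBack=true, because the error surfaces outside the retry loop and reaches the caller.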