[ https://issues.apache.org/jira/browse/HIVE-18772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592292#comment-16592292 ]
Eugene Koifman commented on HIVE-18772:
---------------------------------------

HIVE-20459 would be nice to have here

> Make Acid Cleaner use MIN_HISTORY_LEVEL
> ---------------------------------------
>
>                 Key: HIVE-18772
>                 URL: https://issues.apache.org/jira/browse/HIVE-18772
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Major
>
> Make the Acid Cleaner use MIN_HISTORY_LEVEL instead of Lock Manager state, as it currently does.
> This will eliminate possible race conditions.
> See this
> [comment|https://issues.apache.org/jira/browse/HIVE-18192?focusedCommentId=16338208&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16338208]
>
> Suppose A is the set of all ValidTxnList across all active readers. Each
> ValidTxnList has minOpenTxnId.
> MIN_HISTORY_LEVEL allows us to determine X = min(minOpenTxnId) across all
> currently active readers.
> This means that no active transaction in the system sees any txn with txnid <
> X as open.
> This means that if we construct ValidTxnIdList with HWM=X-1 and use that in
> getAcidState(), any files determined by this call to be 'obsolete' will be seen
> as obsolete by any existing/future reader, i.e. they can be physically deleted.
>
> This is also necessary for multi-statement transactions, where relying on the
> state of Lock Manager is not sufficient. For example:
> Suppose txn 17 starts at t1 and sees txnid 13 with writeID 13 open.
> 13 commits (via its parent txn) at t2 > t1. (17 is still running.)
> Compaction runs at t3 > t2 to produce base_14 (or delta_10_14, for example) on
> Table1/Part1. (17 is still running.)
> Now delta_13 may be cleaned, since it can be seen as obsolete and there may be
> no locks on it, i.e. no one is reading it.
> Now at t4 > t3, 17 (a multi-statement txn) may need to read Table1/Part1. It
> cannot use base_14, as that may have absorbed delete events from delete_delta_14.
> Using MIN_HISTORY_LEVEL solves this.
> See description of HIVE-18747 for more details on MIN_HISTORY_LEVEL
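
To make the reasoning in the description concrete, here is a minimal, self-contained Java sketch of the watermark rule. It is not Hive code: the class `MinHistoryLevelSketch` and the methods `minHistoryLevel`, `cleanerHighWatermark` and `canDelete` are hypothetical names invented for illustration. It also assumes, as the example in the description does, that txn ids and write ids coincide, and that when there are no active readers the next txn id to be allocated can stand in for X.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.OptionalLong;

// Illustrative sketch only; none of these names exist in Hive.
public class MinHistoryLevelSketch {

    /** X = min(minOpenTxnId) across all active readers; empty if there are none. */
    static OptionalLong minHistoryLevel(List<Long> minOpenTxnIdPerReader) {
        return minOpenTxnIdPerReader.stream().mapToLong(Long::longValue).min();
    }

    /**
     * High watermark the Cleaner would use: X - 1.
     * Assumption: with no active readers, nextTxnId stands in for X.
     */
    static long cleanerHighWatermark(List<Long> minOpenTxnIdPerReader, long nextTxnId) {
        return minHistoryLevel(minOpenTxnIdPerReader).orElse(nextTxnId) - 1;
    }

    /**
     * A delta may be physically deleted only if the base that supersedes it is
     * itself visible at or below the Cleaner's high watermark.
     * Assumption: txn ids and write ids are treated as the same number.
     */
    static boolean canDelete(long deltaMaxId, long baseId, long hwm) {
        return baseId <= hwm && deltaMaxId <= baseId;
    }

    public static void main(String[] args) {
        // Txn 17 is still running and saw txn 13 as open, so its minOpenTxnId is 13.
        long hwmWhile17Runs = cleanerHighWatermark(Arrays.asList(13L), 18L);
        System.out.println("HWM while 17 runs: " + hwmWhile17Runs);
        System.out.println("delete delta_13?  " + canDelete(13L, 14L, hwmWhile17Runs));

        // Once 17 has finished there are no active readers left.
        long hwmAfter17 = cleanerHighWatermark(Collections.<Long>emptyList(), 18L);
        System.out.println("HWM after 17:      " + hwmAfter17);
        System.out.println("delete delta_13?  " + canDelete(13L, 14L, hwmAfter17));
    }
}
```

In this sketch, while txn 17 is running the watermark is 12, so base_14 is not yet visible to the Cleaner and delta_13 is kept. Once 17 finishes the watermark moves past 14 and delta_13 becomes deletable, which is the behaviour the description asks for.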