[ https://issues.apache.org/jira/browse/HIVE-20264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576746#comment-16576746 ]
Hive QA commented on HIVE-20264:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 22s{color} | {color:red} /data/hiveptest/logs/PreCommit-HIVE-Build-13153/patches/PreCommit-HIVE-Build-13153.patch does not apply to master. Rebase required? Wrong Branch? See http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-13153/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Bootstrap repl dump with concurrent write and drop of ACID table makes target
> inconsistent.
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-20264
> URL: https://issues.apache.org/jira/browse/HIVE-20264
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 4.0.0, 3.2.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: DR, pull-request-available, replication
> Fix For: 4.0.0
>
> Attachments: HIVE-20264.01-branch-3.patch, HIVE-20264.01.patch,
> HIVE-20264.02.patch
>
>
> During bootstrap dump of ACID tables, let's consider the below sequence.
> - Get lastReplId = last event ID logged.
> - Current session (Thread-1), REPL DUMP -> Open txn (Txn1) - Event-10
> - Another session (Thread-2), Open txn (Txn2) - Event-11
> - Thread-2 -> Insert data (T1.D1) to ACID table. - Event-12
> - Thread-2 -> Commit Txn (Txn2) - Event-13
> - Thread-2 -> Drop table (T1) - Event-14
> - Thread-1 -> Dump ACID tables based on current list of tables. So, T1 will
> be missing.
> - Thread-1 -> Commit Txn (Txn1)
> - REPL LOAD from bootstrap dump will skip T1.
> - Incremental REPL DUMP will start from Event-10, so the target replays the
> write ID allocation for table T1. The Drop table (T1) event is idempotent, so
> it has no effect for the missing T1, and stale entries remain in the
> TXN_TO_WRITE_ID and NEXT_WRITE_ID metastore tables at the target.
> - Now, when we create another table at source with the same name T1 and
> replicate it, readers at the target may see incorrect data for T1.
>
> A couple of proposals:
> 1. Make allocate write ID idempotent. This is not possible: the table doesn't
> exist at the target, but an MM table import may also allocate a write ID before
> creating the table, so these two cases cannot be differentiated.
> 2. Make the Drop table event drop entries from the TXN_TO_WRITE_ID and
> NEXT_WRITE_ID tables irrespective of whether the table exists at the target.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
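
The event sequence in the issue description, together with proposal 2, can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not Hive code: `TargetMetastore`, `allocate_write_id`, `drop_table`, `replay_incremental`, and the `always_clean_write_ids` flag are hypothetical names, and the two dictionaries merely stand in for the TXN_TO_WRITE_ID and NEXT_WRITE_ID metastore tables named in the issue.

```python
class TargetMetastore:
    """Toy stand-in for the target-side write ID bookkeeping (hypothetical)."""

    def __init__(self):
        self.tables = set()          # tables that exist at the target
        self.txn_to_write_id = {}    # (table, txn_id) -> write_id
        self.next_write_id = {}      # table -> next write ID to hand out

    def allocate_write_id(self, table, txn_id):
        # Applied even if the table is absent: the issue notes that an MM
        # table import may allocate a write ID before the table is created.
        write_id = self.next_write_id.get(table, 1)
        self.txn_to_write_id[(table, txn_id)] = write_id
        self.next_write_id[table] = write_id + 1
        return write_id

    def drop_table(self, table, always_clean_write_ids):
        exists = table in self.tables
        self.tables.discard(table)
        # Assumption: without the flag, dropping a missing table is a no-op
        # (the idempotent behaviour described in the issue). With the flag,
        # write ID state is removed whether or not the table exists
        # (proposal 2).
        if exists or always_clean_write_ids:
            self.next_write_id.pop(table, None)
            stale_keys = [k for k in self.txn_to_write_id if k[0] == table]
            for key in stale_keys:
                del self.txn_to_write_id[key]


def replay_incremental(always_clean_write_ids):
    """Replay the T1 events on a target whose bootstrap load skipped T1."""
    target = TargetMetastore()
    target.allocate_write_id("T1", txn_id=2)          # Txn2 writes to T1
    target.drop_table("T1", always_clean_write_ids)   # Drop table (T1)
    return target


stale = replay_incremental(always_clean_write_ids=False)
clean = replay_incremental(always_clean_write_ids=True)
print(stale.next_write_id)  # {'T1': 2}
print(clean.next_write_id)  # {}
```

In the first run the leftover entry for T1 is what a later table with the same name would inherit; in the second run the drop event leaves no write ID state behind.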