[ https://issues.apache.org/jira/browse/HIVE-17196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166490#comment-16166490 ]
Sankar Hariappan commented on HIVE-17196:
-----------------------------------------

+1, patch looks good to me.

cc [~daijy]

> CM: ReplCopyTask should retain the original file names even if copied from CM path.
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-17196
>                 URL: https://issues.apache.org/jira/browse/HIVE-17196
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>    Affects Versions: 2.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Daniel Dai
>             Fix For: 3.0.0
>
>         Attachments: HIVE-17196.1.patch
>
>
> Consider the following scenario:
> 1. Insert into table T1 with value(X).
> 2. Insert into table T1 with value(X).
> 3. Truncate the table T1.
>    – This step backs up 2 files with the same content to cmroot, which ends up
>      with only one file in cmroot because the checksums match.
> 4. Incremental repl with the above 3 operations.
>    – In this step, both insert event files are read from cmroot, and copying
>      one overwrites the other because the file name is the same in the cm
>      path (the checksum is used as the file name).
> This leads to data loss, so it is necessary to retain the original file
> names even when copying from the cm path.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
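
The collision described in the scenario can be illustrated with a small in-memory sketch. This is not the ReplCopyTask code or the attached patch; the class `CmCopySketch`, the `EventFile` type, the file names, and the checksum string are all hypothetical, and plain maps stand in for cmroot and the target directory. It only shows why naming the copied file after its checksum drops one of two identical files, while naming it after the original file keeps both.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CmCopySketch {

    /** Hypothetical: one file referenced by an insert event. */
    static final class EventFile {
        final String originalName; // name the file had in the table directory
        final String checksum;     // name under which cmroot stores its content

        EventFile(String originalName, String checksum) {
            this.originalName = originalName;
            this.checksum = checksum;
        }
    }

    /** Copies from cmroot using the cmroot name (the checksum) as the target name. */
    static Map<String, String> copyWithCmNames(List<EventFile> files,
                                               Map<String, String> cmRoot) {
        Map<String, String> target = new LinkedHashMap<>();
        for (EventFile f : files) {
            // Identical content means identical checksum, so the second
            // copy overwrites the first one in the target directory.
            target.put(f.checksum, cmRoot.get(f.checksum));
        }
        return target;
    }

    /** Copies from cmroot but names each target file after the original file. */
    static Map<String, String> copyWithOriginalNames(List<EventFile> files,
                                                     Map<String, String> cmRoot) {
        Map<String, String> target = new LinkedHashMap<>();
        for (EventFile f : files) {
            target.put(f.originalName, cmRoot.get(f.checksum));
        }
        return target;
    }

    public static void main(String[] args) {
        // After the truncate, cmroot holds a single file for both inserts.
        Map<String, String> cmRoot = new LinkedHashMap<>();
        cmRoot.put("checksum-of-X", "X");

        // Two insert events, each pointing at a differently named file
        // with the same content (file names are made up for illustration).
        List<EventFile> files = Arrays.asList(
                new EventFile("000000_0", "checksum-of-X"),
                new EventFile("000000_0_copy_1", "checksum-of-X"));

        System.out.println(copyWithCmNames(files, cmRoot).size());       // prints 1
        System.out.println(copyWithOriginalNames(files, cmRoot).size()); // prints 2
    }
}
```

With checksum-based names the target ends up with one file, so one of the two inserted rows is lost; with the original names both files survive the copy.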