[ 
https://issues.apache.org/jira/browse/HIVE-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088069#comment-16088069
 ] 

Sankar Hariappan edited comment on HIVE-16990 at 7/14/17 9:10 PM:
------------------------------------------------------------------

Added 01.patch with the updates below.
- TableSerializer and PartitionSerializer now set the current repl state only 
for a bootstrap dump. For an incremental dump, this is done by load.
- Repl load tracks the modified metadata objects using the new 
UpdatedMetadataTracker object. This replaces the dbsUpdated and tablesUpdated 
maps.
- Added alter tasks to update the current repl state of the updated metadata 
objects. These alter tasks are added after applying each event, which 
increases the number of tasks per event. The overall execution time of the 
replication test cases has also increased as a result. Will try to optimise 
later.
- Made ReplCopyTasks throw an error if any of the listed files is missing from 
both the original path and cmpath. Corrected the test cases to handle this 
failure case.
- Removed unused or dead code wherever found.
- Added a new test case to verify the repl status on failure and to ensure 
that a retry of the failed dump works after the fix.
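
To illustrate the missing-file check described in the list above, here is a minimal sketch. It is not the code from the patch: the class and method names (CopySourceResolver, resolveSource) are hypothetical, and it uses java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem API that Hive works with. The only behaviour taken from the comment is that the copy fails when a listed file is found in neither the original path nor cmpath.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; an illustration only, not part of Hive. */
final class CopySourceResolver {

    private CopySourceResolver() {
    }

    /**
     * Picks the location to copy a listed file from.
     * Prefers the original path, falls back to cmpath, and fails
     * loudly if the file exists in neither place.
     */
    static Path resolveSource(Path originalPath, Path cmPath)
            throws FileNotFoundException {
        if (Files.exists(originalPath)) {
            return originalPath;
        }
        if (cmPath != null && Files.exists(cmPath)) {
            return cmPath;
        }
        // Assumption: raising an exception here is what makes the copy task
        // fail, so the event is not treated as applied.
        throw new FileNotFoundException("File missing from both original path "
                + originalPath + " and cmpath " + cmPath);
    }
}
```

Failing here, rather than silently skipping the file, is what lets the load notice that the data copy did not complete.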

Request [~daijy]/[~sushanth]/[~anishek]/[~thejas] to review the patch!







> REPL LOAD should update last repl ID only after successful copy of data files.
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-16990
>                 URL: https://issues.apache.org/jira/browse/HIVE-16990
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive, repl
>    Affects Versions: 2.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>              Labels: DR, replication
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16990.01.patch
>
>
> REPL LOAD operations that include both metadata and data changes should 
> follow the rule below.
> 1. Copy the metadata, excluding the last repl ID.
> 2. Copy the data files.
> 3. If steps 1 and 2 are successful, update the last repl ID of the object.
> This rule allows failed events to be re-applied by REPL LOAD and 
> ensures no data loss due to failures.
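
The three-step rule in the issue description can be sketched as follows. This is an illustration under stated assumptions, not Hive code: ReplLoadSketch, TargetObject, DataCopier and applyEvent are hypothetical names, the metadata is modelled as a plain map, and skipping events at or below the stored last repl ID is an assumed retry policy that the description does not spell out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the rule; not part of Hive. */
final class ReplLoadSketch {

    /** Stand-in for a replicated object (table or partition) on the target. */
    static final class TargetObject {
        final Map<String, String> metadata = new HashMap<>();
        long lastReplId = 0L; // ID of the last event that was applied in full
    }

    /** Stand-in for the data file copy, which may fail. */
    interface DataCopier {
        void copy() throws Exception;
    }

    /** Applies one event and returns true only if it was applied in full. */
    static boolean applyEvent(TargetObject target, long eventId,
                              Map<String, String> newMetadata,
                              DataCopier copier) {
        if (eventId <= target.lastReplId) {
            return true; // assumed policy: already applied, skip on retry
        }
        try {
            target.metadata.putAll(newMetadata); // step 1: metadata, no repl ID
            copier.copy();                       // step 2: data files
        } catch (Exception e) {
            // The last repl ID is left untouched, so a retry re-applies the event.
            return false;
        }
        target.lastReplId = eventId;             // step 3: only after 1 and 2
        return true;
    }
}
```

Because the last repl ID moves only in step 3, a failure in the copy leaves the object marked as not yet up to date, and a later REPL LOAD applies the same event again.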



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
