[ https://issues.apache.org/jira/browse/KUDU-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved KUDU-1605.
-------------------------------
Resolution: Fixed
Fix Version/s: 1.0.0
> Blocks can be incorrectly deleted if TS crashes mid-tablet-copy
> ---------------------------------------------------------------
>
> Key: KUDU-1605
> URL: https://issues.apache.org/jira/browse/KUDU-1605
> Project: Kudu
> Issue Type: Bug
> Components: tserver
> Affects Versions: 0.10.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
> Fix For: 1.0.0
>
>
> There's currently a bug in the way we handle tablet copies while replacing
> existing tombstoned tablets:
> - a tablet exists in TABLET_DATA_TOMBSTONED state
> - we begin copying a new replica on top of this one
> -- this calls TabletMetadata::ReplaceSuperBlock() using the _remote_
> superblock (importantly, this remote superblock contains remote block IDs)
> - we crash mid-copy
> - on restart, we see the "TABLET_DATA_COPYING" state and "roll forward" the
> deletion of this tablet. However, the block IDs in the superblock are the IDs
> from the remote machine, so we incorrectly delete any local blocks that
> happen to share those IDs.
> This has always been an issue, but it was made worse in 0.10 by the fix for
> KUDU-1538. After that fix, the likelihood of a remote block ID matching a
> local one is quite high, whereas before we'd _usually_ not see this bug.
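
The sequence described above can be reproduced in a toy model. The sketch below is not Kudu code: `ToyTabletServer`, `SuperBlock`, `begin_copy`, `restart`, the integer block IDs, and the `TABLET_DATA_READY` label are all hypothetical names chosen for illustration. It only assumes what the report states, namely that the remote superblock replaces the local one before any data is copied, and that recovery deletes every block listed in a superblock found in the copying state.

```python
from dataclasses import dataclass, field

TOMBSTONED = "TABLET_DATA_TOMBSTONED"
COPYING = "TABLET_DATA_COPYING"
READY = "TABLET_DATA_READY"  # label assumed for illustration


@dataclass
class SuperBlock:
    data_state: str
    block_ids: list = field(default_factory=list)


class ToyTabletServer:
    """Toy model: one shared local block store plus per-tablet superblocks."""

    def __init__(self):
        self.blocks = {}       # local block ID -> owning tablet
        self.superblocks = {}  # tablet ID -> persisted SuperBlock

    def begin_copy(self, tablet_id, remote_superblock):
        # Models the step from the report: the remote superblock, which still
        # lists the remote server's block IDs, replaces the local one.
        self.superblocks[tablet_id] = SuperBlock(
            COPYING, list(remote_superblock.block_ids)
        )

    def restart(self):
        # Models recovery: a tablet found in the copying state has its
        # deletion rolled forward, deleting every block its superblock lists.
        deleted = []
        for sb in self.superblocks.values():
            if sb.data_state == COPYING:
                for block_id in sb.block_ids:
                    if self.blocks.pop(block_id, None) is not None:
                        deleted.append(block_id)
                sb.block_ids = []
                sb.data_state = TOMBSTONED
        return deleted


ts = ToyTabletServer()
# Healthy local tablet B owns local blocks 1, 2 and 3.
ts.blocks = {1: "tablet-B", 2: "tablet-B", 3: "tablet-B"}
ts.superblocks["tablet-B"] = SuperBlock(READY, [1, 2, 3])
# Tablet A is tombstoned locally and owns no blocks.
ts.superblocks["tablet-A"] = SuperBlock(TOMBSTONED, [])

# The remote copy of tablet A uses block IDs 2, 3 and 4 on the remote machine.
remote = SuperBlock(READY, [2, 3, 4])
ts.begin_copy("tablet-A", remote)

# Simulated crash: no block was copied before the server restarts.
deleted = ts.restart()
print(deleted)
```

In this model the restart deletes local blocks 2 and 3, which belong to tablet B and were never part of tablet A. The more often local and remote block IDs overlap, the more data is lost this way.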