[ https://issues.apache.org/jira/browse/HBASE-16096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15512492#comment-15512492 ]
Appy commented on HBASE-16096:
------------------------------
This new test possibly made TestReplicationSourceManagerZkImpl flaky. See [this
failure|https://builds.apache.org/job/HBASE-Flaky-Tests/3399/testReport/junit/org.apache.hadoop.hbase.replication.regionserver/TestReplicationSourceManagerZkImpl/testLogRoll/].
Am I interpreting it right that this test case leaves the fake peer around,
which the next test case, testLogRoll, then tries to connect to?
[~ashu210890], [~Vegetable26]
cc. [~jmhsieh]
> Replication keeps accumulating znodes
> -------------------------------------
>
> Key: HBASE-16096
> URL: https://issues.apache.org/jira/browse/HBASE-16096
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Affects Versions: 2.0.0, 1.2.0, 1.3.0
> Reporter: Ashu Pachauri
> Assignee: Joseph
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-16096-branch-1.patch, HBASE-16096.patch
>
>
> If there is an error while creating the replication source on adding the
> peer, the source is not added to the in-memory list of sources but the
> replication peer is.
> However, in such a scenario, when you remove the peer, it is deleted from
> zookeeper successfully, but before removing it from the in-memory list of
> peers, we wait for the corresponding sources to get deleted (which, as we
> said, don't exist because of the error creating the source).
> The problem here is the ordering of operations for adding/removing source and
> peer.
> Modifying the code to always remove queues from the underlying storage, even
> if no sources exist, also requires a small refactoring of
> TableBasedReplicationQueuesImpl so that it does not abort on removeQueues()
> of an empty queue.
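The ordering problem in the description above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class PeerOrderingSketch and all of its methods are hypothetical and are not HBase APIs, and the map named storedQueues merely stands in for the queue storage (znodes). It contrasts cleanup that is gated on a source existing with cleanup that always runs.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; none of these names come from the HBase code base.
class PeerOrderingSketch {
    private final Set<String> peers = new HashSet<>();    // in-memory list of peers
    private final Set<String> sources = new HashSet<>();  // in-memory list of sources
    // Stand-in for the underlying queue storage (e.g. znodes): peerId -> queued log count.
    private final Map<String, Integer> storedQueues = new HashMap<>();

    /** Registers the peer and its queue, then creates the source unless creation "fails". */
    boolean addPeer(String peerId, boolean sourceCreationFails) {
        peers.add(peerId);
        storedQueues.put(peerId, 0);
        if (sourceCreationFails) {
            return false; // peer and queue now exist, but no source was added
        }
        sources.add(peerId);
        return true;
    }

    /** Problematic ordering: cleanup only happens if a source was found and removed. */
    void removePeerGatedOnSource(String peerId) {
        if (sources.remove(peerId)) {
            removeQueues(peerId);
            peers.remove(peerId);
        }
    }

    /** Alternative ordering: always clean up storage and the peer, source or not. */
    void removePeerAlways(String peerId) {
        sources.remove(peerId);
        removeQueues(peerId);
        peers.remove(peerId);
    }

    /** Removing the queues of an unknown or empty peer is a no-op rather than an abort. */
    void removeQueues(String peerId) {
        storedQueues.remove(peerId);
    }

    boolean hasPeer(String peerId) {
        return peers.contains(peerId);
    }

    boolean hasStoredQueue(String peerId) {
        return storedQueues.containsKey(peerId);
    }
}
```

In this model, a peer added with sourceCreationFails set to true survives removePeerGatedOnSource(), along with its stored queue, while removePeerAlways() clears both. A test that creates such a peer would need the second kind of cleanup to avoid leaking it into the next test case.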