[ https://issues.apache.org/jira/browse/KUDU-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425882#comment-17425882 ]
ASF subversion and git services commented on KUDU-1620:
-------------------------------------------------------
Commit 3884a6388b2696a826b8903144ae555faa595473 in kudu's branch
refs/heads/master from Andrew Wong
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=3884a63 ]
[consensus] KUDU-1620: re-resolve consensus peers on network error
This plumbs the work from KUDU-75 into the long-lived consensus proxy,
allowing Raft peers to re-resolve on error.
This has the knock-on effect that masters starting up also re-resolve
other masters' addresses when attempting to fetch UUIDs, since this
process also uses consensus proxies.
Change-Id: Ibd1b68c3c14d7d8f81168e16fe450d2ffcce840b
Reviewed-on: http://gerrit.cloudera.org:8080/17868
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin
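
For readers unfamiliar with the pattern, the behavior described in the commit
message can be sketched roughly as follows. This is not Kudu's code (Kudu is
written in C++); it is a minimal Python illustration of a long-lived proxy that
caches a resolved address and re-resolves the hostname after a network error.
The class name ReResolvingProxy, the injectable resolver parameter, and the
retry-once policy are all assumptions made for the example.

```python
import socket


class ReResolvingProxy:
    """Hypothetical long-lived proxy that caches a resolved peer address."""

    def __init__(self, host, port, resolver=None):
        self.host = host
        self.port = port
        # The resolver is injectable so the sketch can be exercised without DNS.
        self._resolver = resolver or self._dns_resolve
        self._addr = None  # cached (ip, port); None means "resolve before next call"

    @staticmethod
    def _dns_resolve(host, port):
        # Take the first address returned by the system resolver.
        info = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return info[0][4][:2]

    def call(self, send):
        """Invoke send(addr); on a network error, drop the cache and retry once."""
        for attempt in range(2):
            if self._addr is None:
                self._addr = self._resolver(self.host, self.port)
            try:
                return send(self._addr)
            except OSError:
                # Network error: the peer may have moved behind the same hostname,
                # so forget the cached address and resolve again on the next try.
                self._addr = None
                if attempt == 1:
                    raise
```

The point of the sketch is that the hostname, not the resolved address, is the
stable identity of the peer: if a replacement peer takes over the same DNS name,
the next call after a failure picks up its new address without a restart.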
> Consensus peer proxy hostnames should be reresolved on failure
> --------------------------------------------------------------
>
> Key: KUDU-1620
> URL: https://issues.apache.org/jira/browse/KUDU-1620
> Project: Kudu
> Issue Type: Bug
> Components: consensus
> Affects Versions: 1.0.0
> Reporter: Adar Dembo
> Priority: Major
> Labels: docker
>
> Noticed this while documenting the workflow to replace a dead master, which
> currently bypasses Raft config changes in favor of having the replacement
> master "masquerade" as the dead master via DNS changes.
> Internally we never rebuild consensus peer proxies in the event of network
> failure; we assume that the peer will return at the same location. Nominally
> this is reasonable; allowing peers to change host/port information on the fly
> is tricky and has yet to be implemented. But we should at least retry the
> DNS resolution; not doing so forces the workflow to include steps to restart
> the existing masters, which creates a (small) availability outage.