[ 
https://issues.apache.org/jira/browse/SOLR-11083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087063#comment-16087063
 ] 

Noble Paul commented on SOLR-11083:
-----------------------------------

The same command should also be able to extend the down-time when it is 
invoked repeatedly.

So MOVEREPLICA for shared FS could be built with the following steps:

#  invoke TMPUNLOAD on the source replica with timeout=60 secs
#  create a new replica with the same coreNodeName and coreName on the 
target node
#  keep tracking the target node and see if the replica comes up within, 
say, 50 secs
#  if it's still not up, extend the lease by invoking the TMPUNLOAD command 
again with timeout=60 secs
#  repeat until the target replica comes up or an overall timeout of, say, 
300 secs is reached
#  if the target replica comes up successfully, UNLOAD the source replica
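
The steps above can be sketched as a lease-renewal loop. This is only an 
illustration of the proposed flow, not Solr code: TMPUNLOAD is the command 
proposed in this comment, and the four callables (tmp_unload, create_target, 
target_is_up, unload_source) are hypothetical stand-ins for whatever the 
overseer would actually invoke. The 60/50/300 second values are the ones 
suggested above.

```python
import time
from typing import Callable


def move_replica_shared_fs(
    tmp_unload: Callable[[int], None],      # hypothetical: TMPUNLOAD with a timeout
    create_target: Callable[[], None],      # hypothetical: create replica on target
    target_is_up: Callable[[], bool],       # hypothetical: is the target replica up?
    unload_source: Callable[[], None],      # hypothetical: permanent UNLOAD of source
    lease_secs: int = 60,
    check_secs: int = 50,
    overall_timeout_secs: int = 300,
    poll_secs: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the move completed, False if the overall timeout hit."""
    start = clock()
    tmp_unload(lease_secs)   # step 1: take the initial lease on the source
    create_target()          # step 2: same coreNodeName/coreName on the target

    while True:
        # step 3: watch the target for one check window, capped by the
        # overall deadline
        window_end = min(clock() + check_secs, start + overall_timeout_secs)
        while clock() < window_end:
            if target_is_up():
                unload_source()  # step 6: target is up, drop the source
                return True
            sleep(poll_secs)

        # step 5: give up once the overall timeout is reached; the source is
        # never UNLOADed here, so it is not lost
        if clock() - start >= overall_timeout_secs:
            return False

        # step 4: still not up, so extend the lease before it expires
        tmp_unload(lease_secs)
```

Checking every 50 secs against a 60 sec lease leaves a margin, so the lease 
is renewed before it runs out. On the failure path the sketch simply returns 
without unloading the source; what happens to the half-created target 
replica is not covered here.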


> MoveReplica API can lose replicas for shared file systems on overseer restart 
> if source node is live
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11083
>                 URL: https://issues.apache.org/jira/browse/SOLR-11083
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: hdfs, SolrCloud
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 7.1
>
>
> MoveReplica unloads the old replica and creates a new one for shared file 
> systems. But if the overseer restarts between the two operations, then the 
> old replica is lost. It is up to the user to detect the failure (using the 
> request status API) and retry.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
