[ 
https://issues.apache.org/jira/browse/SOLR-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346778#comment-17346778
 ] 

Ishan Chattopadhyaya commented on SOLR-14245:
---------------------------------------------

The production issue we encountered today prevented every node from starting up 
properly because {{node_name}} was null for one of the replicas (out of 
thousands of others).

 
{quote}{{Replica}} is a critical piece of information, if it's invalid then 
something seriously wrong already happened.
{quote}
Yes, it means something wrong has happened. However, your fix means that no 
nodes will now start, instead of that particular collection being affected. 
This is absolutely insane.

 
{quote}That's the whole point of validation, to quickly catch errors that can 
cause long-term subtle corruption.
{quote}
Such errors are due to Solr bugs, not user actions. So, would you rather such 
an error bring down the whole system than just affect the particular 
collection? IMHO, this is making a bad situation worse.
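
To make the distinction concrete, here is a minimal Java sketch of the behaviour I am arguing for. All class and method names in it ({{ClusterStateLoadSketch}}, {{ReplicaInfo}}, {{loadAll}}) are hypothetical and are not Solr's actual code. It assumes validation still happens when a replica object is created, but a failure is contained to the collection that owns the bad replica, so the rest of the cluster state still loads.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; these are NOT Solr's real classes.
final class ClusterStateLoadSketch {

    // Stand-in for a replica record that validates its fields on creation.
    static final class ReplicaInfo {
        final String name;
        final String nodeName;

        ReplicaInfo(String name, String nodeName) {
            this.name = Objects.requireNonNull(name, "name must not be null");
            this.nodeName = Objects.requireNonNull(nodeName, "node_name must not be null");
        }
    }

    // raw maps a collection name to rows of {replicaName, nodeName}.
    // A collection containing an invalid replica is skipped and recorded in
    // 'failed'; the remaining collections are still loaded.
    static Map<String, List<ReplicaInfo>> loadAll(Map<String, String[][]> raw, List<String> failed) {
        Map<String, List<ReplicaInfo>> loaded = new LinkedHashMap<>();
        for (Map.Entry<String, String[][]> entry : raw.entrySet()) {
            try {
                List<ReplicaInfo> replicas = new ArrayList<>();
                for (String[] row : entry.getValue()) {
                    replicas.add(new ReplicaInfo(row[0], row[1]));
                }
                loaded.put(entry.getKey(), replicas);
            } catch (NullPointerException invalidReplica) {
                // Contain the damage to this one collection.
                failed.add(entry.getKey());
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        Map<String, String[][]> raw = new LinkedHashMap<>();
        raw.put("books", new String[][] {{"core_node1", "host1:8983_solr"}});
        raw.put("logs", new String[][] {{"core_node2", null}}); // corrupted entry
        List<String> failed = new ArrayList<>();
        Map<String, List<ReplicaInfo>> loaded = loadAll(raw, failed);
        System.out.println("loaded=" + loaded.keySet() + " failed=" + failed);
    }
}
```

In this sketch the corrupted {{logs}} collection is reported as failed while {{books}} loads normally. Whether the real loading path can isolate failures this cleanly is an assumption on my part.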

 

This change must be immediately reverted and a breakfix release must be pushed 
out (until 8.9 can accommodate it).

> Validate Replica / ReplicaInfo on creation
> ------------------------------------------
>
>                 Key: SOLR-14245
>                 URL: https://issues.apache.org/jira/browse/SOLR-14245
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Minor
>             Fix For: 8.5
>
>
> Replica / ReplicaInfo should be immutable and their fields should be 
> validated on creation.
> Some users reported that very rarely during a failed collection CREATE or 
> DELETE, or when the Overseer task queue becomes corrupted, Solr may write to 
> ZK incomplete replica infos (eg. node_name = null).
> This problem is difficult to reproduce but we should add safeguards anyway to 
> prevent writing such corrupted replica info to ZK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
