[ 
https://issues.apache.org/jira/browse/SOLR-5811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920023#comment-13920023
 ] 

Mark Miller commented on SOLR-5811:
-----------------------------------

When the Overseer was first considered, one of the primary ideas was that 
commands could fail over if not completed or be retried on failures, etc. A lot 
of this is not there yet. The Overseer will actually retry failed commands now, 
but it's much too dumb about it - a command that cannot or will not succeed will 
tie up the whole processing pipeline.

In the short term, I don't think we should retry most work items. 

Longer term, it seems like we should track retries and perhaps give up at some 
point - or something smarter than retrying as fast as possible until success.
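
To make the "track retries and give up" idea concrete, here is a rough sketch. It is not Overseer code: `WorkItem`, `process_queue`, and `max_attempts` are hypothetical names, and a plain in-memory deque stands in for the ZK distributed queue. A failed item goes to the back of the queue so other items can proceed, and after a fixed number of attempts it is set aside rather than retried forever.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class WorkItem:
    """Hypothetical work item; 'attempts' counts failed executions."""
    name: str
    action: Callable[[], None]
    attempts: int = 0


def process_queue(queue: Deque[WorkItem], max_attempts: int = 3) -> List[WorkItem]:
    """Drain the queue, retrying each failing item at most max_attempts times.

    Returns the items that were given up on, so a caller can log or
    report them instead of blocking the pipeline.
    """
    given_up: List[WorkItem] = []
    while queue:
        item = queue.popleft()
        try:
            item.action()
        except Exception:
            item.attempts += 1
            if item.attempts >= max_attempts:
                # Stop retrying: a command that will never succeed
                # should not tie up everything behind it.
                given_up.append(item)
            else:
                # Requeue at the back so healthy items run first.
                queue.append(item)
    return given_up
```

Note that requeueing at the back reorders work items, which may not be acceptable if commands depend on each other; that is an assumption of this sketch. A delay between attempts could be added on top of it to avoid retrying as fast as possible.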

> The Overseer will retry work items until success.
> -------------------------------------------------
>
>                 Key: SOLR-5811
>                 URL: https://issues.apache.org/jira/browse/SOLR-5811
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.8, 5.0
>
>
> This means that if you get a bad item in the ZK distributed queue, it can 
> lock up your Overseer as it continuously retries the bad command. The 
> workaround is to manually clear the Overseer ZK queue.


