[ https://issues.apache.org/jira/browse/SOLR-5811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920023#comment-13920023 ]
Mark Miller commented on SOLR-5811:
-----------------------------------
When the Overseer was first considered, one of the primary ideas was that
commands could fail over if not completed or be retried on failures, etc. A lot
of this is not there yet. The Overseer will actually retry failed commands now,
but it's much too dumb about it - a command that cannot or will not succeed will
tie up the whole processing pipeline.
In the short term, I don't think we should retry most work items.
Longer term, it seems like we should track retries and perhaps give up at some
point - or do something smarter than retrying as fast as possible until success.
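To make the longer-term idea concrete, here is a rough sketch of tracking retries per work item and giving up after a limit, so one bad command cannot tie up everything behind it. This is not the Overseer's code: `WorkItem`, `process_queue`, `max_attempts`, and `base_delay` are hypothetical names chosen for illustration, and an in-memory deque stands in for the ZK distributed queue.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkItem:
    """Hypothetical work item that remembers how often it was tried."""
    name: str
    attempts: int = 0


def process_queue(names, handler, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run handler on each item; retry failures a bounded number of times.

    Returns (done, failed): names that succeeded, and names that were
    abandoned after max_attempts failed attempts.
    """
    queue = deque(WorkItem(n) for n in names)
    done, failed = [], []
    while queue:
        item = queue.popleft()
        item.attempts += 1
        try:
            handler(item.name)
        except Exception:
            if item.attempts >= max_attempts:
                # Give up so the item cannot block the pipeline forever.
                failed.append(item.name)
            else:
                # Back off exponentially instead of retrying as fast as
                # possible, then requeue at the back so other items run first.
                sleep(base_delay * 2 ** (item.attempts - 1))
                queue.append(item)
            continue
        done.append(item.name)
    return done, failed
```

In this sketch the backoff sleep happens inline, which still delays the items behind it for a bounded time. A real implementation would more likely record a "not before" timestamp on the item and keep processing other work in the meantime.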
> The Overseer will retry work items until success.
> -------------------------------------------------
>
> Key: SOLR-5811
> URL: https://issues.apache.org/jira/browse/SOLR-5811
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Reporter: Mark Miller
> Assignee: Mark Miller
> Fix For: 4.8, 5.0
>
>
> This means that if you get a bad item in the ZK distributed queue, it can
> lock up your Overseer as it continuously retries the bad command. The
> workaround is to manually clear the Overseer ZK queue.