psalagnac opened a new pull request, #3350:
URL: https://github.com/apache/solr/pull/3350

   
   https://issues.apache.org/jira/browse/SOLR-17754
   
   
   # Description
   
   Fixes a stuck overseer that can occur under high load, when the overseer has 
at least 100 running tasks.
   
   See [Jira](https://issues.apache.org/jira/browse/SOLR-17754) for the full 
scenario, as the description is pretty long.
   
   # Solution
   
   This fixes the overseer main loop so that we never submit more than 100 
concurrent tasks to the thread pool. Instead of manually tracking when a task 
is complete, we check its status using a standard Java `Future`.
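
The idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `BoundedSubmitter` and its methods are hypothetical names, and it assumes a single thread (the main loop) does all the submitting. It keeps the `Future` returned by `ExecutorService.submit`, drops the ones for which `Future.isDone()` is true, and refuses to submit once the limit is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch; not the actual Solr overseer code. Not thread-safe. */
final class BoundedSubmitter {
  private final ExecutorService pool;
  private final int maxInFlight; // 100 in the scenario described above
  private final List<Future<?>> inFlight = new ArrayList<>();

  BoundedSubmitter(ExecutorService pool, int maxInFlight) {
    this.pool = pool;
    this.maxInFlight = maxInFlight;
  }

  /** Forgets tasks that finished normally, failed, or were cancelled. */
  private void pruneCompleted() {
    inFlight.removeIf(Future::isDone);
  }

  /**
   * Submits the task only if fewer than maxInFlight tasks are still running.
   * Returns false when the limit is reached, so the caller can retry later.
   */
  boolean trySubmit(Runnable task) {
    pruneCompleted();
    if (inFlight.size() >= maxInFlight) {
      return false;
    }
    inFlight.add(pool.submit(task));
    return true;
  }

  /** Number of submitted tasks that have not completed yet. */
  int inFlightCount() {
    pruneCompleted();
    return inFlight.size();
  }
}
```

Relying on `Future.isDone()` means a task that fails with an exception or is cancelled still frees its slot, with no separate bookkeeping to keep in sync.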
   
   The change also makes sure we don't write the result to the ZK response node 
when we should not (see 
[comment](https://issues.apache.org/jira/browse/SOLR-17754?focusedCommentId=17951216&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17951216)),
 removing the erroneous occurrences of the log message 
`"Response ZK path: <node> doesn't exist. Requestor may have disconnected from ZooKeeper"`.
   
   # Tests
   
   Adds a new test to make sure we no longer fail with a lot of tasks.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

