[ 
https://issues.apache.org/jira/browse/SOLR-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089078#comment-14089078
 ] 

Ramkumar Aiyengar edited comment on SOLR-6261 at 8/7/14 9:43 AM:
-----------------------------------------------------------------

This possibly exposes a design issue with OCP. A watch is created on every 
{{workQueue.peekTopN}} call, even when work items are already present. The 
call can then fail the exclusivity check, try again, find items, and create 
one more watch.

{code}
~/builds/lucene-solr/solr/core| less out.log | grep 'Exclusivity check failed' | wc -l
6960
{code}

Obviously, with the new code you run into issues because that creates 1000s of 
threads, but the underlying problem is that the code creates 1000s of watches 
in the first place.
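
To make the watch build-up concrete, here is a minimal sketch using a toy 
in-memory queue. It is not Solr's queue implementation and it does not talk to 
ZooKeeper; the class {{WatchCountSketch}} and its methods are hypothetical 
names made up for illustration. It only counts how many watches would be 
registered if every peek sets one, compared with an assumed alternative that 
sets a watch only when the queue is empty.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy in-memory stand-in for a watched work queue (hypothetical, not Solr code). */
class WatchCountSketch {
    private final Deque<String> items = new ArrayDeque<>();
    private int watchesRegistered = 0;

    void offer(String item) {
        items.addLast(item);
    }

    int watchesRegistered() {
        return watchesRegistered;
    }

    /** Mimics the behaviour described in the comment: every peek registers a watch. */
    List<String> peekAlwaysWatching(int n) {
        watchesRegistered++;
        return firstN(n);
    }

    /** Assumed alternative: register a watch only when there is nothing to return. */
    List<String> peekWatchingOnlyWhenEmpty(int n) {
        if (items.isEmpty()) {
            watchesRegistered++;
        }
        return firstN(n);
    }

    /** Returns up to n items from the head of the queue without removing them. */
    private List<String> firstN(int n) {
        List<String> out = new ArrayList<>();
        for (String s : items) {
            if (out.size() == n) {
                break;
            }
            out.add(s);
        }
        return out;
    }

    /** Simulates `retries` peeks on a non-empty queue whose item is never consumed. */
    static int watchesAfter(int retries, boolean alwaysWatch) {
        WatchCountSketch queue = new WatchCountSketch();
        queue.offer("task-blocked-by-exclusivity-check");
        for (int i = 0; i < retries; i++) {
            if (alwaysWatch) {
                queue.peekAlwaysWatching(10);
            } else {
                queue.peekWatchingOnlyWhenEmpty(10);
            }
        }
        return queue.watchesRegistered();
    }

    public static void main(String[] args) {
        int failedChecks = 6960; // count taken from the grep on out.log
        System.out.println("always watching:     " + watchesAfter(failedChecks, true));
        System.out.println("watch only if empty: " + watchesAfter(failedChecks, false));
    }
}
```

In this toy model the first variant registers one watch per retry, while the 
second registers none as long as the blocked item stays in the queue. Whether 
the real queue can skip the watch this way is an assumption, not something 
verified against the Solr code.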


was (Author: andyetitmoves):
This is possibly exposing a design issue with OCP. There is a watch being 
created with each {{workQueue.peekTopN}}, even if there are work items present. 
It could then fail exclusivity check, and then try again, find items and create 
one more watch..

{code}
~/builds/lucene-solr/solr/core| less out.log | grep 'Exclusivity check failed' 
| wc -l
6960
{code}


> Run ZK watch event callbacks in parallel to the event thread
> ------------------------------------------------------------
>
>                 Key: SOLR-6261
>                 URL: https://issues.apache.org/jira/browse/SOLR-6261
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 4.9
>            Reporter: Ramkumar Aiyengar
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 5.0, 4.10
>
>
> Currently checking for leadership (due to the leader's ephemeral node going 
> away) happens in ZK's event thread. If there are many cores and all of them 
> are due leadership, then they would have to serially go through the two-way 
> sync and leadership takeover.
> For tens of cores, this could mean 30-40s without leadership before the last 
> in the list even gets to start the leadership process. If the leadership 
> process happens in a separate thread, then the cores could all take over in 
> parallel.
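
The hand-off described in the issue can be sketched roughly as below. This is 
a minimal illustration and not the actual patch: {{ParallelCallbackSketch}} and 
{{dispatch}} are hypothetical names, and the fixed pool size is an assumption 
chosen so that a burst of events cannot create an unbounded number of threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: run event callbacks on a bounded pool, off the calling thread. */
class ParallelCallbackSketch {

    /**
     * Submits `callbacks` tasks to a pool of `poolSize` threads, records the names
     * of the threads that ran them, and returns how many tasks completed.
     */
    static int dispatch(int callbacks, int poolSize, Set<String> threadNames) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < callbacks; i++) {
            // The caller (standing in for the event thread) only enqueues the work.
            pool.execute(() -> {
                threadNames.add(Thread.currentThread().getName());
                completed.incrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return completed.get();
    }

    public static void main(String[] args) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        int done = dispatch(40, 4, names);
        System.out.println(done + " callbacks ran on " + names.size() + " pool thread(s)");
    }
}
```

With a bounded pool the callbacks still run in parallel to the submitting 
thread, but the number of worker threads stays capped regardless of how many 
events arrive.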



--
This message was sent by Atlassian JIRA
(v6.2#6252)
