[ 
https://issues.apache.org/jira/browse/SOLR-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-6591:
----------------------------------------
    Attachment: SOLR-6591-no-mixed-batches.patch

Right now, updates to the main cluster state are batched together, while
updates to collections with stateFormat > 1 are not batched (I'll create
another issue for that). However, updates to both can be mixed together, e.g.
if the overseer gets 5 messages for the main cluster state and then 1 for
stateFormat > 1, the resulting updates are written to ZK together. This is
error-prone, and we shouldn't batch updates for different stateFormats together.

This patch tracks the stateFormat of the last message that was processed and
breaks out of the batching loop as soon as a message with a different
stateFormat is encountered.
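
To illustrate the idea, here is a minimal, self-contained sketch. It is not the code from the attached patch: the class BatchByStateFormat, the Message type, and the nextBatch method are hypothetical names made up for this example, and the real Overseer loop reads from a ZK-backed queue rather than an in-memory list. The sketch only shows how remembering the last stateFormat and breaking on a change keeps each batch homogeneous.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchByStateFormat {

    /** Hypothetical stand-in for a state update message in the overseer queue. */
    public static final class Message {
        public final String collection;
        public final int stateFormat;

        public Message(String collection, int stateFormat) {
            this.collection = collection;
            this.stateFormat = stateFormat;
        }
    }

    /**
     * Collects consecutive messages starting at index {@code start} and stops
     * as soon as a message with a different stateFormat is seen, so a single
     * batch never mixes stateFormats.
     */
    public static List<Message> nextBatch(List<Message> queue, int start) {
        List<Message> batch = new ArrayList<>();
        int lastStateFormat = -1; // -1 means no message has been processed yet
        for (int i = start; i < queue.size(); i++) {
            Message m = queue.get(i);
            if (lastStateFormat != -1 && m.stateFormat != lastStateFormat) {
                break; // different stateFormat: leave it for the next batch
            }
            batch.add(m);
            lastStateFormat = m.stateFormat;
        }
        return batch;
    }

    public static void main(String[] args) {
        // The scenario from the comment above: 5 messages for the main
        // cluster state (stateFormat 1), then 1 for stateFormat > 1.
        List<Message> queue = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            queue.add(new Message("collection" + i, 1));
        }
        queue.add(new Message("perCollectionState", 2));

        int pos = 0;
        while (pos < queue.size()) {
            List<Message> batch = nextBatch(queue, pos);
            System.out.println("batch of " + batch.size()
                + " with stateFormat " + batch.get(0).stateFormat);
            pos += batch.size();
        }
    }
}
```

With the queue built in main, the first batch holds the 5 stateFormat 1 messages and the second holds the single stateFormat 2 message, so the two kinds of update are flushed separately.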

> Cluster state updates can be lost on exception in main queue loop
> -----------------------------------------------------------------
>
>                 Key: SOLR-6591
>                 URL: https://issues.apache.org/jira/browse/SOLR-6591
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: Trunk
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: Trunk
>
>         Attachments: SOLR-6591-constructStateFix.patch, 
> SOLR-6591-no-mixed-batches.patch, SOLR-6591.patch
>
>
> I found this bug while going through the failure on jenkins:
> https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/648/
> {code}
> 2 tests failed.
> REGRESSION:  
> org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch
> Error Message:
> Error CREATEing SolrCore 'halfcollection_shard1_replica1': Unable to create 
> core [halfcollection_shard1_replica1] Caused by: Could not get shard id for 
> core: halfcollection_shard1_replica1
> Stack Trace:
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error 
> CREATEing SolrCore 'halfcollection_shard1_replica1': Unable to create core 
> [halfcollection_shard1_replica1] Caused by: Could not get shard id for core: 
> halfcollection_shard1_replica1
>         at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:570)
>         at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:215)
>         at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:211)
>         at 
> org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testErrorHandling(CollectionsAPIDistributedZkTest.java:583)
>         at 
> org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:205)
>         at 
> org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
> {code}



