[ 
https://issues.apache.org/jira/browse/SOLR-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-6591:
----------------------------------------
    Attachment: SOLR-6591-ignore-no-collection-path.patch

{quote}
A rapid create+delete loop for collections with state format > 1 causes the 
above exception to happen. This is because the updateZkState method assumes 
that the collection exists and it tries to write to 
/collections/collection_name/state.json directly without verifying whether the 
/collections/collection_name zk node exists
{quote}

This patch ignores state messages that try to create a new collection when 
the parent zk path doesn't exist. I've added the following comment to the 
code to explain the situation:
{quote}
// if the /collections/collection_name path doesn't exist then it means that
// 1) the user invoked a DELETE collection API and the OverseerCollectionProcessor has deleted
// this zk path.
// 2) these are most likely old "state" messages which are only being processed now because
// if they were new "state" messages then in legacy mode, a new collection would have been
// created with stateFormat = 1 (which is the default state format)
// 3) these can't be new "state" messages created for a new collection because
// otherwise the OverseerCollectionProcessor would have already created this path
// as part of the create collection API call -- which is the only way in which a collection
// with stateFormat > 1 can possibly be created
{quote}
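
For readers who want the rule in executable form, here is a minimal sketch of the check described above. It is not the code from the attached patch: the class StateMessageFilterSketch, its methods, and the in-memory set that stands in for the ZooKeeper tree are all hypothetical names made up for illustration. It also assumes that messages for stateFormat = 1 collections never need the per-collection parent node, since that state is kept in the shared cluster state rather than under /collections/collection_name.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory model of the check; not the actual Overseer code. */
class StateMessageFilterSketch {

    /** Stand-in for the ZooKeeper tree: the set of znode paths that currently exist. */
    private final Set<String> existingPaths = new HashSet<>();

    public void createPath(String path) {
        existingPaths.add(path);
    }

    public void deletePath(String path) {
        existingPaths.remove(path);
    }

    /**
     * Returns true if a "state" message for the given collection should be applied,
     * false if it should be ignored as a stale message for a deleted collection.
     */
    public boolean shouldApplyStateMessage(String collection, int stateFormat) {
        if (stateFormat <= 1) {
            // Assumption: stateFormat = 1 does not write under /collections/<name>,
            // so the parent-path check does not apply to it.
            return true;
        }
        // For stateFormat > 1 the state lives in /collections/<name>/state.json,
        // so the parent node must already exist before anything is written.
        return existingPaths.contains("/collections/" + collection);
    }

    public static void main(String[] args) {
        StateMessageFilterSketch zk = new StateMessageFilterSketch();

        // The create collection API call creates the parent path first.
        zk.createPath("/collections/halfcollection");
        System.out.println(zk.shouldApplyStateMessage("halfcollection", 2));

        // A DELETE removes the path; late "state" messages are then ignored.
        zk.deletePath("/collections/halfcollection");
        System.out.println(zk.shouldApplyStateMessage("halfcollection", 2));
    }
}
```

The point of the sketch is only the ordering: the existence of the parent path is checked before any write to state.json is attempted, so a stale message is dropped instead of triggering the exception in the main queue loop.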



> Cluster state updates can be lost on exception in main queue loop
> -----------------------------------------------------------------
>
>                 Key: SOLR-6591
>                 URL: https://issues.apache.org/jira/browse/SOLR-6591
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: Trunk
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>             Fix For: Trunk
>
>         Attachments: SOLR-6591-constructStateFix.patch, 
> SOLR-6591-ignore-no-collection-path.patch, SOLR-6591-no-mixed-batches.patch, 
> SOLR-6591.patch
>
>
> I found this bug while going through the failure on jenkins:
> https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/648/
> {code}
> 2 tests failed.
> REGRESSION:  org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch
> Error Message:
> Error CREATEing SolrCore 'halfcollection_shard1_replica1': Unable to create core [halfcollection_shard1_replica1] Caused by: Could not get shard id for core: halfcollection_shard1_replica1
> Stack Trace:
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error CREATEing SolrCore 'halfcollection_shard1_replica1': Unable to create core [halfcollection_shard1_replica1] Caused by: Could not get shard id for core: halfcollection_shard1_replica1
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:570)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:215)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:211)
>         at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testErrorHandling(CollectionsAPIDistributedZkTest.java:583)
>         at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:205)
>         at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
> {code}



