[
https://issues.apache.org/jira/browse/SOLR-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783593#comment-16783593
]
Erick Erickson commented on SOLR-13291:
---------------------------------------
Possibly related to SOLR-13021, although 13021 is in the tests...
> Failed to create collection due to lock held by this virtual machine
> --------------------------------------------------------------------
>
> Key: SOLR-13291
> URL: https://issues.apache.org/jira/browse/SOLR-13291
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 7.5, 7.7
> Environment: * Solr 7.7.1 (also reproduced on 7.5)
> * running on Ubuntu 18.04 (also reproduced on AWS instances using Amazon
> Linux)
> * Using OpenJDK 11.0.1 as distributed by AdoptOpenJDK
> * setting up a solr example cloud using `solr start -e cloud` and accepting
> all default values (2 cluster nodes)
> Reporter: Joachim Sauer
> Priority: Major
> Attachments: tortureSolr.sh
>
>
> We have an unusual workload that sometimes involves deleting and re-creating
> collections with the same name within a short period of time (don't ask why).
>
> When running in a SolrCloud cluster this will occasionally leave one of the
> cores behind, still locked, even though the collection deletion was reported
> to have finished successfully.
>
> This results in an error the next time a collection with that name is
> created.
>
> The attached shell script consistently reproduces the error state within a
> small number of iterations against the 7.7.1 binary distribution running the
> default cloud example (`solr start -e cloud`, accept all default values).
>
> Log entries that seemed relevant to me:
>
> At the time when the collection is deleted:
> {code}
> 2019-03-04 16:56:44.037 WARN (Thread-24) [c:myCollection s:shard2
> r:core_node4 x:myCollection_shard2_replica_n2] o.a.s.c.ZkController listener
> throws error
> org.apache.solr.common.SolrException: Unable to reload core
> [myCollection_shard2_replica_n2]
> at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1463)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at
> org.apache.solr.core.SolrCore.lambda$getConfListener$20(SolrCore.java:3041)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at
> org.apache.solr.cloud.ZkController.lambda$fireEventListeners$21(ZkController.java:2803)
> [solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.solr.common.SolrException
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1048)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at org.apache.solr.core.SolrCore.reload(SolrCore.java:666)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1439)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> ... 3 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.solr.metrics.SolrMetricManager.loadShardReporters(SolrMetricManager.java:1160)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at
> org.apache.solr.metrics.SolrCoreMetricManager.loadReporters(SolrCoreMetricManager.java:92)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:920)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at org.apache.solr.core.SolrCore.reload(SolrCore.java:666)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1439)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> {code}
>
> Later, when trying to re-create the collection:
>
> {code}
> 2019-03-04 16:56:51.982 ERROR
> (OverseerThreadFactory-9-thread-5-processing-n:127.0.1.1:8983_solr) [ ]
> o.a.s.c.a.c.OverseerCollectionMessageHandler Error from shard:
> http://127.0.1.1:8983/solr
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore
> 'myCollection_shard2_replica_n2': Unable to create core
> [myCollection_shard2_replica_n2
> ] Caused by: Lock held by this virtual machine:
> /home/joachim/workspaces/devtools/solr-7.7.1/example/cloud/node1/solr/myCollection_shard2_replica_n2/data/index/write.lock
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643)
> ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 -
> ishan - 2019-02-23 02:39:09]
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
> ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 -
> ishan - 2019-02-23 02:39:09]
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
> ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 -
> ishan - 2019-02-23 02:39:09]
> at
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
> ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:09]
> at
> org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:173)
> ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:07]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
> ~[metrics-core-3.2.6.jar:3.2.6]
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> [solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan
> - 2019-02-23 02:39:09]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [?:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> {code}
>
>
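For readers without access to the attachment, the create/delete cycle described in the issue can be sketched roughly as follows. This is not the contents of tortureSolr.sh, which is not reproduced in this message; it is a minimal illustration that assumes a node listening on localhost:8983 and a collection with 2 shards, 2 replicas each, and maxShardsPerNode=2. The helper names (collections_url, http_get, torture) are hypothetical. Only the Collections API CREATE and DELETE actions are taken from Solr itself.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the first node started by `solr start -e cloud`.
SOLR_BASE = "http://localhost:8983/solr"


def collections_url(action, name, **params):
    """Build a Collections API URL for the given action and collection name."""
    query = {"action": action, "name": name, **params}
    return f"{SOLR_BASE}/admin/collections?{urlencode(query)}"


def http_get(url):
    """Issue a GET request and return the HTTP status code."""
    with urlopen(url) as response:
        return response.status


def torture(name, iterations, fetch=http_get):
    """Create and delete the same collection repeatedly.

    Returns the 1-based iteration at which a request first failed,
    or None if every request succeeded.
    """
    # Assumption: shard and replica counts chosen for a 2-node example cloud.
    create = collections_url(
        "CREATE", name, numShards=2, replicationFactor=2, maxShardsPerNode=2
    )
    delete = collections_url("DELETE", name)
    for i in range(1, iterations + 1):
        for url in (create, delete):
            try:
                status = fetch(url)
            except OSError:
                # urllib raises HTTPError (an OSError subclass) on 4xx/5xx.
                return i
            if status != 200:
                return i
    return None
```

Against a running example cloud, a call such as torture("myCollection", 50) would be expected to return the iteration at which the CREATE request first fails, if the lock problem described above occurs. The fetch parameter exists only so the loop can be exercised without a live Solr instance.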