Hi All,

We have upgraded Solr from 9.6.0 to 9.6.1 and the shards became active.
We then tried to create a collection, and collection creation also
succeeded after the upgrade.

During this process we noticed that one shard was down (no active leader
found). So we stopped all three nodes, started node-1, and forced the
leader to node-1 for the shard that had the issue. After that, all the
shards became active.
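
For anyone following along, one way to spot a shard with no active leader is to look at the output of the Collections API CLUSTERSTATUS action. The sketch below is only an illustration and not what we actually ran: it assumes the response has already been fetched and parsed as JSON, the helper name `shards_without_leader` is made up, and the sample data is invented to mimic the usual response layout.

```python
def shards_without_leader(cluster_status, collection):
    """Return the names of shards that have no active leader replica.

    `cluster_status` is assumed to be the parsed JSON body of a
    CLUSTERSTATUS response (cluster -> collections -> shards -> replicas).
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    missing = []
    for shard_name, shard in shards.items():
        replicas = shard.get("replicas", {}).values()
        # A replica counts as leader only if it is flagged and active.
        has_leader = any(
            r.get("leader") == "true" and r.get("state") == "active"
            for r in replicas
        )
        if not has_leader:
            missing.append(shard_name)
    return sorted(missing)


# Invented sample data, shaped like a CLUSTERSTATUS response.
sample_status = {
    "cluster": {
        "collections": {
            "my_collection": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node3": {"state": "active", "leader": "true"},
                            "core_node5": {"state": "active"},
                        }
                    },
                    "shard4": {
                        "replicas": {
                            "core_node59": {"state": "down"},
                        }
                    },
                }
            }
        }
    }
}

print(shards_without_leader(sample_status, "my_collection"))
```

With the invented sample above, only shard4 is reported, because its single replica is down and nothing is flagged as leader.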

We then started node-2 and node-3, and the cluster is healthy now.
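
The force-leader step described above can be sketched roughly as follows. This is a minimal sketch, assuming the Collections API FORCELEADER action is used while only the chosen node is running; the base URL `http://node-1:6010/solr`, the credentials, and the helper names are placeholders and not taken from our setup. Basic auth is included only because authentication is enabled on our cluster.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def force_leader_url(base_url, collection, shard):
    """Build a Collections API FORCELEADER URL for a single shard."""
    query = urlencode(
        {"action": "FORCELEADER", "collection": collection, "shard": shard}
    )
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


def force_leader(base_url, collection, shard, user=None, password=None, timeout=30):
    """Send the FORCELEADER request and return the raw response text.

    Basic auth is optional; pass user/password if the cluster requires it.
    """
    request = Request(force_leader_url(base_url, collection, shard))
    if user is not None:
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder host and names; only prints the URL, sends nothing.
    print(force_leader_url("http://node-1:6010/solr", "my_collection", "shard4"))
```

FORCELEADER itself takes only a collection and a shard, so which replica wins depends on which nodes are up; that is why the other two nodes were kept stopped until the shard had a leader.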


* The Solr documentation says, "when the cluster is degraded, it is
recommended not to upgrade the cluster version", but we still did the
upgrade from 9.6.0 to 9.6.1.

Thanks Robi and Jose for your support.

Regards
Sathish P


On Thu, 5 Sept, 2024, 12:23 Jose, Manu, <m.j...@ub.uni-frankfurt.de> wrote:

> Hi,
>
> I have raised this issue before and am posting it again.
>
> I have a baremetal Solr cluster with 3 Solr nodes and 3 Zookeeper nodes.
> When I tried to upgrade from 9.5.0 to 9.6.0 or 9.6.1, it was not
> successful. The below error shows that I am not able to create any new
> collections.
>
> After that, I reverted to Solr 9.5.0, and the issue was solved.
>
> Zookeeper version: 3.4.13-6ubuntu4.1--1
>
> Thanks,
>
> Manu Jose
>
> -----Original Message-----
>
> From: Sathish Ponnusamy <sathishrp...@gmail.com>
>
> Sent: Thursday, 5 September 2024 05:20
>
> To: users@solr.apache.org
>
> Subject: Re: SolrCore failed to load on startup and not in cluster state
>
>
>
> Hi Robi,
>
>
>
> Just to clarify, we have not modified any of the files or the schema; they
> are the same before and after the restart.
>
> If we roll back to version 9.5, is any reindexing required? One more
> question: will switching between the root user and the solr user [provided
> all the privileges] cause any issues when loading the Solr core?
>
>
>
> Regards
>
> Sathish Ponnusamy
>
>
>
>
>
> On Thu, Sep 5, 2024 at 8:00 AM Robi Petersen <robip...@gmail.com> wrote:
>
>
>
> > assuming a lot of things like it's a first install, not something
>
> > where you've been modifying the schema in bad ways and just restarting
>
> > with existing data type of thing, changing field definitions can be
>
> > iffy
>
> >
>
> > On Wed, Sep 4, 2024 at 7:25 PM Robi Petersen <robip...@gmail.com> wrote:
>
> >
>
> > > I mean I don't see anything that stands out above...
>
> > >
>
> > >
>
> > > On Wed, Sep 4, 2024 at 7:25 PM Robi Petersen <robip...@gmail.com>
> wrote:
>
> > >
>
> > >> did you try as the other gentleman suggested to roll back to 9.5?
>
> > >> just for the sake of expediency?
>
> > >>
>
> > >> On Wed, Sep 4, 2024 at 7:21 PM Sathish Ponnusamy <
>
> > sathishrp...@gmail.com>
>
> > >> wrote:
>
> > >>
>
> > >>> Hi Robi,
>
> > >>>
>
> > >>> Please find the details from solr.log. I noticed the same error
> > >>> for all the shards that were down. We stopped and restarted the
> > >>> Solr/ZK services, and in fact rebooted the machine as well, but
> > >>> that didn't solve the problem.
>
> > >>>
>
> > >>> Please let me know if you need any other information.
>
> > >>>
>
> > >>> ---------------
>
> > >>>
>
> > >>> 2024-09-04 12:12:01.473 ERROR (coreLoadExecutor-21-thread-2-processing-geslrd1.dev.nonprod.gcp.net:6010_solr) [c:my_collection s:shard4 r:core_node59 x:my_collection_shard4_replica_n56 t:] o.a.s.c.CoreContainer SolrCore failed to load on startup => org.apache.solr.cloud.ZkController$NotInClusterStateException: coreNodeName core_node59 does not exist in shard shard4, ignore the exception if the replica was deleted
> > >>>         at org.apache.solr.cloud.ZkController.checkStateInZk(ZkController.java:2062)
> > >>> org.apache.solr.cloud.ZkController$NotInClusterStateException: coreNodeName core_node59 does not exist in shard shard4, ignore the exception if the replica was deleted
> > >>>         at org.apache.solr.cloud.ZkController.checkStateInZk(ZkController.java:2062) ~[solr-core-9.6.0.jar:9.6.0 f8e5a93c11267e13b7b43005a428bfb910ac6e57 - gus - 2024-04-22 23:20:52]
> > >>>         at org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1958) ~[solr-core-9.6.0.jar:9.6.0 f8e5a93c11267e13b7b43005a428bfb910ac6e57 - gus - 2024-04-22 23:20:52]
> > >>>         at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1704) ~[solr-core-9.6.0.jar:9.6.0 f8e5a93c11267e13b7b43005a428bfb910ac6e57 - gus - 2024-04-22 23:20:52]
> > >>>         at org.apache.solr.core.CoreContainer.lambda$loadInternal$12(CoreContainer.java:1057) ~[solr-core-9.6.0.jar:9.6.0 f8e5a93c11267e13b7b43005a428bfb910ac6e57 - gus - 2024-04-22 23:20:52]
> > >>>         at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:212) ~[metrics-core-4.2.25.jar:4.2.25]
> > >>>         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
> > >>>         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
> > >>>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:312) ~[solr-solrj-9.6.0.jar:9.6.0 f8e5a93c11267e13b7b43005a428bfb910ac6e57 - gus - 2024-04-22 23:20:52]
> > >>>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
> > >>>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
> > >>>         at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
> > >>>
> > >>> 2024-09-04 12:12:01.479 INFO (coreLoadExecutor-21-thread-2-processing-geslrd1.dev.nonprod.gcp.net:6010_solr) [c: s: r: x: t:] o.a.s.c.SolrConfig Using Lucene MatchVersion: 9.10.0
> > >>>
> > >>> 2024-09-04 12:12:01.480 INFO (coreLoadExecutor-21-thread-2-processing-geslrd1.dev.nonprod.gcp.net:6010_solr) [c: s: r: x: t:] o.a.s.s.IndexSchema Schema name=my_collection
> > >>>
> > >>> 2024-09-04 12:12:01.480 INFO (coreLoadExecutor-21-thread-2-processing-geslrd1.dev.nonprod.gcp.net:6010_solr) [c: s: r: x: t:] o.a.s.s.IndexSchema Loaded schema my_collection/1.6 with uniqueid field my_collection_unique_id
>
> > >>>
>
> > >>> ---------------------------------
>
> > >>>
>
> > >>> Sathish Ponnusamy
>
> > >>>
>
> > >>> Chennai
>
> > >>> m: + 91 9962331981
>
> > >>> e: sathishrp...@gmail.com
>
> > >>>
>
> > >>>
>
> > >>> On Wed, Sep 4, 2024 at 11:52 PM Robi Petersen <robip...@gmail.com>
>
> > >>> wrote:
>
> > >>>
>
> > >>> > could be a release bug? not likely? hopefully...
>
> > >>> >
>
> > >>> > I can't recreate it locally. But perhaps we need a 9.6.1 patch?
> > >>> > Any other users out there experiencing this?
>
> > >>> >
>
> > >>> > On Wed, Sep 4, 2024 at 10:38 AM Robi Petersen
>
> > >>> > <robip...@gmail.com>
>
> > >>> wrote:
>
> > >>> >
>
> > >>> > > That's not a good sign... yeah something fatal happened
> > >>> > > loading solr core. need more logs to see what...
>
> > >>> > >
>
> > >>> > > On Wed, Sep 4, 2024 at 6:31 AM Jose, Manu <
>
> > >>> m.j...@ub.uni-frankfurt.de>
>
> > >>> > > wrote:
>
> > >>> > >
>
> > >>> > >>
>
> > >>> > >> Hello Sathish P,
> > >>> > >>
> > >>> > >> I had the same issue while upgrading from 9.5.0 to 9.6.0, so I
> > >>> > >> reverted to 9.5.0 and the issue was solved...
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >> Manu Jose
>
> > >>> > >> -----Original Message-----
> > >>> > >> From: Sathish Ponnusamy <sathishrp...@gmail.com>
> > >>> > >> Sent: Wednesday, 4 September 2024 15:12
> > >>> > >> To: solr-u...@lucene.apache.org
> > >>> > >> Subject: SolrCore failed to load on startup and not in cluster state
>
> > >>> > >>
>
> > >>> > >> We created a collection and indexed data into Solr 9.6.0, which
> > >>> > >> is running in SolrCloud mode. After a brief period the Solr
> > >>> > >> shards went down, throwing errors like the ones below. Has
> > >>> > >> anyone faced this problem before, and what is the fix?
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >> *Solr config details:*
>
> > >>> > >>
>
> > >>> > >>    - Solr running on 3 VMs.
>
> > >>> > >>    - ZK running on 3 VMs.
>
> > >>> > >>    - Solr version - 9.6.0
>
> > >>> > >>    - Zk version – 3.9.2
>
> > >>> > >>    - Authentication enabled.
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >> *Error from Console Logging*
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >> ERROR false my_collection_shard1_replica_n14 CoreContainer SolrCore my_collection_shard1_replica_n14 in /data/my_collection_shard1_replica_n14 is not in cluster state.
> > >>> > >> ERROR false my_collection_shard1_replica_n14 CoreContainer SolrCore failed to load on startup
> > >>> > >>
> > >>> > >> ERROR false my_collection_shard3_replica_n32 CoreContainer SolrCore my_collection_shard3_replica_n32 in /data/my_collection_shard3_replica_n32 is not in cluster state.
> > >>> > >> ERROR false my_collection_shard3_replica_n32 CoreContainer SolrCore failed to load on startup
> > >>> > >>
> > >>> > >> ERROR false my_collection_shard4_replica_n56 CoreContainer SolrCore my_collection_shard4_replica_n56 in /data/my_collection_shard4_replica_n56 is not in cluster state.
> > >>> > >> ERROR false my_collection_shard4_replica_n56 CoreContainer SolrCore failed to load on startup
> > >>> > >>
> > >>> > >> ERROR false my_collection_shard6_replica_n10 CoreContainer SolrCore my_collection_shard6_replica_n10 in /data/my_collection_shard6_replica_n10 is not in cluster state.
> > >>> > >> ERROR false my_collection_shard6_replica_n10 CoreContainer SolrCore failed to load on startup
> > >>> > >>
> > >>> > >> ERROR false my_collection_shard8_replica_n62 CoreContainer SolrCore my_collection_shard8_replica_n62 in /data/my_collection_shard8_replica_n62 is not in cluster state.
> > >>> > >> ERROR false my_collection_shard8_replica_n62 CoreContainer SolrCore failed to load on startup
>
> > >>> > >>
>
> > >>> > >>
>
> > >>> > >> Regards
>
> > >>> > >> Sathish P
>
> > >>> > >>
>
> > >>> > >
>
> > >>> >
>
> > >>>
>
> > >>
>
> >
>
