Soft commit takes 5 seconds in Solr 8.9.0
Hello,

We are using Solr 8.9.0. We have configured SolrCloud with 2 shards, and each shard has one replica. We use a 5-node ZooKeeper ensemble for SolrCloud.

We have used the below schema fields in the employee collection:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" docValues="true"/>
<field ... type="text" indexed="true" stored="true" multiValued="true"/>

*Total no of records*: 8562099
*Size of instance:* solrgnrls2r1 67 GB, solrgnrls1 66 GB, solrgnrls1r1 66 GB, solrgnrls2 68 GB

*Solr logs:*
2023-09-14 10:04:30.705 DEBUG (qtp1984975621-8805766) [c:forms s:shard1 r:core_node3 x:forms_shard1_replica_n1] o.a.s.u.DirectUpdateHandler2 updateDocuments(add{_version_=1777003156686766080,id=EMP5487098118986160})
2023-09-14 10:04:30.710 INFO (qtp1984975621-8805766) [c:forms s:shard1 r:core_node3 x:forms_shard1_replica_n1] o.a.s.u.p.LogUpdateProcessorFactory [forms_shard1_replica_n1] webapp=/solr path=/update params={wt=javabin&version=2}{add=[FORM5487098118986160 (1777003156686766080)]} 0 5
2023-09-14 10:04:30.807 DEBUG (commitScheduler-930-thread-1) [c:employee s:shard1 r:core_node3 x:employee_shard1_replica_n1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
2023-09-14 10:04:35.134 DEBUG (commitScheduler-930-thread-1) [c:employee s:shard1 r:core_node3 x:employee_shard1_replica_n1] o.a.s.s.SolrIndexSearcher Opening [Searcher@796ab9b9[employee_shard1_replica_n1] main]
2023-09-14 10:04:35.413 DEBUG (commitScheduler-930-thread-1) [c:employee s:shard1 r:core_node3 x:employee_shard1_replica_n1] o.a.s.u.DirectUpdateHandler2 end_commit_flush

Why is the commitScheduler thread taking 5 seconds to complete? Because of this, we cannot see the latest update for id EMP5487098118986160. We also have another collection with an index size of 120 GB and 744620373 documents, yet there is no slowness in its soft commits.
When we checked the Solr source code, we found that the time is spent in:

ExitableDirectoryReader.wrap(UninvertingReader.wrap(reader, core.getLatestSchema().getUninversionMapper()), SolrQueryTimeoutImpl.getInstance());
this.leafReader = SlowCompositeReaderWrapper.wrap(this.reader);

How can we troubleshoot the issue?
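As a starting point for troubleshooting, you can measure exactly how long each soft commit takes straight from the DEBUG log lines quoted above. The awk sketch below does this under the assumption of the stock Solr log timestamp format `YYYY-MM-DD HH:MM:SS.mmm`; the sample file is seeded with the two commitScheduler lines from this thread, and you would point the awk command at your real solr.log instead:

```shell
#!/bin/sh
# Sketch: compute soft-commit duration from DirectUpdateHandler2 DEBUG lines.
# Assumes the stock log timestamp format "YYYY-MM-DD HH:MM:SS.mmm".
# The sample lines are taken from the log excerpt in this thread.
cat > /tmp/solr_sample.log <<'EOF'
2023-09-14 10:04:30.807 DEBUG (commitScheduler-930-thread-1) [c:employee s:shard1 r:core_node3 x:employee_shard1_replica_n1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
2023-09-14 10:04:35.413 DEBUG (commitScheduler-930-thread-1) [c:employee s:shard1 r:core_node3 x:employee_shard1_replica_n1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
EOF

awk '
  # Convert "HH:MM:SS.mmm" (field 2 of each log line) to milliseconds.
  function ms(ts,  t) { split(ts, t, /[:.]/); return ((t[1]*60 + t[2])*60 + t[3])*1000 + t[4] }
  /start commit/     { start = ms($2) }
  /end_commit_flush/ { printf "soft commit took %d ms\n", ms($2) - start }
' /tmp/solr_sample.log
# prints: soft commit took 4606 ms
```

If the gap consistently sits between `start commit` and `SolrIndexSearcher Opening`, the time is going into opening the new searcher (the wrapping code you found), which points at per-segment work such as uninverting fields rather than the flush itself.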
Re: Soft commit takes 5 seconds in Solr 8.9.0
How many docs have you added before the softCommit?
Do you use any cache warming or other commit hooks?

Jan

> 28. sep. 2023 kl. 13:28 skrev John Jackson :
> [...]
Re: Soft commit takes 5 seconds in Solr 8.9.0
> How many docs have you added before the softCommit?

Only one record, EMP5487098118986160, was added.

> Do you use any cache warming or other commit hooks?

No, we are not using any cache warming, and our commit settings in solrconfig are:

60
2
false
100}

We index via ZooKeeper and do not send a commit after indexing, because autocommit is configured in solrconfig.xml.

On Thu, Sep 28, 2023 at 5:15 PM Jan Høydahl wrote:
> How many docs have you added before the softCommit?
> Do you use any cache warming or other commit hooks?
> [...]
Re: Soft commit takes 5 seconds in Solr 8.9.0
> 100}

There seems to be a typo here with the "}"?

A 100 ms soft-commit interval is unusually low; you risk commits piling up during rapid indexing and causing inefficiencies. I'd increase it to at least 1000 ms.

Can you reproduce this on an IDLE system by simply adding ONE document?
What does your document look like? Number of fields, size, nested docs etc.?
Does it happen every time, or just once in a while?
Do you have access to system metrics for the server and JVM that can tell something about its general health and load?

Jan

> 28. sep. 2023 kl. 13:54 skrev John Jackson :
> [...]
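Jan's suggestion would look roughly like this in solrconfig.xml — a sketch using the stock autoCommit/autoSoftCommit elements, with illustrative values, since the snippet in the original mail lost its markup:

```xml
<!-- Sketch only: element names from the stock solrconfig.xml; the maxTime
     values are illustrative, not taken from the original snippet. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>60000</maxTime>            <!-- hard commit every 60 s -->
    <openSearcher>false</openSearcher>  <!-- don't open a searcher on hard commit -->
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>             <!-- Jan's suggested minimum, up from 100 ms -->
  </autoSoftCommit>
</updateHandler>
```

Each soft commit opens a new searcher on that collection, so the autoSoftCommit interval is a direct trade-off between update visibility and the searcher-opening cost being discussed in this thread.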
RE: Cancelling an Async operation - Shard split
Dear Community,

Is there a way to cancel a shard split operation in Solr? I couldn't find any such option in the collection/core management APIs. I see the operation is tracked via ZooKeeper nodes; will I be able to cancel the operation by clearing these nodes from ZK?

Regards,
Hitendra
Re: Backup from old server and Restore to new server
Scanned this thread; apologies if I missed something, but here are a few thoughts:

To get better advice, make it clear whether you are running Solr in Cloud mode (a.k.a. self-managed) or Legacy mode (a.k.a. user-managed). Some quick ways to tell which:

1. Is there an associated ZooKeeper cluster? If yes, you are in cloud mode; if not, then *probably* legacy (there is a way to run ZooKeeper embedded, but that's not the normal setup).
2. In the admin UI, do you see the word 'Cloud' in the left navigation bar? If yes, cloud; if no, legacy.

*Key concept: Solr is (normally) just a server providing access to an index of your data. It allows you to find a link or id for a "document" but does not (normally) serve as a repository for your data.*

This has some implications:

1. Solr is typically paired with one or more data repositories (database, file system, SharePoint, etc.).
2. Solr normally cannot reindex data all by itself. Re-indexing is the process of re-reading the repository and creating a fresh index.
3. Solr is just an index and does not manage the process of reading the data from sources (exceptions like the Data Import Handler [DIH] and streaming expressions exist, but DIH went away in 9.x, and these are exceptions, not the rule).
4. Typically *something* outside of Solr sends documents to Solr. Re-indexing is normally the process of re-triggering that something to send the documents again.
5. This is unlike a database, which contains both the data (the table) and an index (PK/FK/index) of the data.
6. Versus a database, Solr's benefit is that it is an index of the *words* in the text of the document rather than of entire string values.

Thus (exceptional cases excluded) things you do to or in Solr don't "trigger reindexing".

I have implied that sometimes Solr can be the store for your data, which is technically true. Unfortunately, this is tricky to get right, may negatively impact performance, and results in long-term data loss if done wrong, so it's rarely recommended.
I hope you haven't inherited this type of problem!

Upgrading Solr across a single minor version is often simple but occasionally requires work. Always read the release notes and test the result before going live. Upgrading across major versions is always work. Lucene (and therefore Solr) requires that you reindex your data with each major version. There are stopgap tools to allow upgrading an existing index, but that is a temporary measure that only works from N to N+1, and you are expected to re-index before N+2.

- Gus

--
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)
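For the cloud-mode case, moving an index between servers is typically driven through the Collections API's BACKUP and RESTORE actions rather than file copying. A sketch of the two calls — the hostnames, collection name, and backup location below are placeholders, not details from this thread, and the location must be a path visible to every node (e.g. a shared mount):

```shell
#!/bin/sh
# Sketch: Collections API backup on the old cluster, restore on the new one.
# All names here are placeholder assumptions for illustration.
SOLR_OLD="http://old-server:8983"
SOLR_NEW="http://new-server:8983"

# Snapshot the collection's index and config to a shared location.
BACKUP_CMD="${SOLR_OLD}/solr/admin/collections?action=BACKUP&name=mybackup&collection=mycollection&location=/mnt/shared/backups"

# Recreate the collection from that snapshot on the new cluster.
RESTORE_CMD="${SOLR_NEW}/solr/admin/collections?action=RESTORE&name=mybackup&collection=mycollection&location=/mnt/shared/backups"

# Against live clusters, run:
#   curl "$BACKUP_CMD"
#   curl "$RESTORE_CMD"
echo "$BACKUP_CMD"
```

Note this carries the index as-is, so it does not sidestep the major-version rule above: a restored 8.x index is still an 8.x index.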
Re: Cancelling an Async operation - Shard split
Unless you are very experienced and comfortable with Solr, do not edit ZooKeeper nodes directly. The things you should touch generally have support in bin/solr or other provided tools. If you edit the wrong things you can cause all manner of chaos, and even completely ruin the entire cluster, requiring everything to be rebuilt from scratch.

I've not tried to stop a shard split before, so I don't know if there's a good way to do that, but don't experiment with ZooKeeper (unless it's a test system you don't care about).

-Gus

On Thu, Sep 28, 2023 at 11:00 AM Hitendra Talluri wrote:
> Is there a way to cancel a shard split operation in solr?
> [...]

--
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)
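One safe thing the Collections API does offer for async operations: you can inspect a request with REQUESTSTATUS and, once it has completed or failed, clear its stored state with DELETESTATUS. Neither call cancels a running split — this is only a sketch for monitoring, with the host and request id as placeholder assumptions:

```shell
#!/bin/sh
# Sketch: inspect / clean up an async Collections API request (e.g. a shard
# split submitted with the "async" parameter). Host and request id are
# placeholders; neither call cancels an operation that is still running.
SOLR="http://localhost:8983"
REQ_ID="split-42"

# Reports the request state: submitted, running, completed, failed, or notfound.
STATUS_CMD="${SOLR}/solr/admin/collections?action=REQUESTSTATUS&requestid=${REQ_ID}"

# Removes the stored status entry after the request has finished.
DELETE_CMD="${SOLR}/solr/admin/collections?action=DELETESTATUS&requestid=${REQ_ID}"

# Against a live cluster:
#   curl "$STATUS_CMD"
#   curl "$DELETE_CMD"
echo "$STATUS_CMD"
```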