Error in Fq parsing while using multiple values in solr 8.7
Hi,

We have a filter query in our system, "fq=negativeattribute:(citychennai%20citydelhi)", which was working fine in Solr 6.5. Solr 6.5 parsed it as:

negativeattribute:citychennai negativeattribute:citydelhi

After upgrading to Solr 8.7 this query broke and no longer works as before. Solr 8.7 parses it as:

"parsed_filter_queries": [ "negativeattribute:citychennai citydelhi" ]

Schema of negattribute field:

This is still working fine for fields which have only a string type mapping; for those, the fq is applied to every value inside the parentheses.

--
Re: Error in Fq parsing while using multiple values in solr 8.7
Hi Shawn,

Thank you. I also had my suspicion on the sow parameter, but I can't figure out why it is acting differently for analyzed and non-analyzed field types. For example, if I give this query to Solr 8.7:

fq=negativeattribute:(citychennai mcat43120 20mcat43120)&debug=query&fq=mcatid:(43120 26527 43015)

it parses both filter queries as below; as you can see, for the mcatid field it works as if sow were true.

"parsed_filter_queries": [ "negativeattribute:citychennai mcat43120 mcat43120", "mcatid:43120 mcatid:26527 mcatid:43015" ]

Schema of negattribute field:
<filter class="solr.TrimFilterFactory"/>

Schema of mcatid field:

On Wed, Jul 21, 2021 at 8:42 PM Shawn Heisey wrote:
> On 7/20/2021 11:37 PM, Satya Nand wrote:
> > We have a filter query in our system
> > "fq=negativeattribute:(citychennai%20citydelhi)", in solr 6.5 it was
> > working fine. solr 6.5 parsed query as negativeattribute:citychennai
> > negativeattribute:citydelhi. After upgrading the solr to
> > 8.7, this query broke. It is not working as before. solr 8.7 parsed
> > query as "parsed_filter_queries": [ "negativeattribute:citychennai
> > citydelhi" ]. Schema of negattribute field
>
> The "sow" query parameter (split on whitespace) now defaults to false.
> This is intentional. Your analysis chain doesn't split the input into
> tokens, so the value is accepted as-is -- with the space.
>
> It is expected that the query analysis definition will do the splitting
> now, not the query parser.
>
> You can add "sow=true" to the query parameters, either on the request or
> in the handler definition, and regain the behavior you're expecting.
> But if you actually do intend to have this field be an exact match of
> all characters including space, that's probably not the best idea. If
> you change the fq to the following, it would also work:
>
> fq=negativeattribute:(citychennai OR citydelhi)
>
> Thanks,
> Shawn

--
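For reference, Shawn's two suggested fixes expressed as a small SolrJ sketch (the field and values are the ones from this thread; the main query and anything else is a placeholder for illustration):

// Two ways to restore the pre-8.x behaviour for this fq, per Shawn's reply.
import org.apache.solr.client.solrj.SolrQuery;

public class SowExample {
  public static void main(String[] args) {
    // Option 1: keep the space-separated form and ask the parser to split on
    // whitespace again (sow defaulted to true before Solr 7, false afterwards).
    SolrQuery q1 = new SolrQuery("*:*");
    q1.set("sow", "true");
    q1.addFilterQuery("negativeattribute:(citychennai citydelhi)");

    // Option 2: make the boolean structure explicit so sow no longer matters.
    SolrQuery q2 = new SolrQuery("*:*");
    q2.addFilterQuery("negativeattribute:(citychennai OR citydelhi)");

    System.out.println(q1);
    System.out.println(q2);
  }
}

Option 1 can also be made permanent by adding sow=true to the request handler defaults, as Shawn notes, instead of setting it per request.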
Big difference in response time on solr 8.7; Optimized vs un Optimized core
Hi,

We have recently upgraded Solr from version 6.5 to version 8.7. But contrary to our expectations, the response time increased by 40%.

On Solr 8.7 the difference between the optimized and unoptimized index is also very large: 350 ms on the optimized core and 650 ms on the unoptimized one. The difference in size between the optimized and unoptimized cores is only 5 GB. The segment count is 1 in the optimized index and 20 in the unoptimized index.

I wanted to ask: is this normal behavior on Solr 8.7, or was there some setting that we forgot to add? Please also tell us how we can reduce the response time on the unoptimized core.

*Specifications*
We are using master-slave architecture, polling interval is 3 hours
RAM: 96 GB
CPU: 14
Heap: 30 GB
Index size: 95 GB
Segment count: 20
Merge policy: TieredMergePolicyFactory with maxMergeAtOnce=5, segmentsPerTier=3

--
Re: Big difference in response time on solr 8.7; Optimized vs un Optimized core
Hi Deepak, We actually tried with 128 GB machine, Didn't help in response time. So we moved back to 96GB. On Tue, Aug 3, 2021 at 2:11 PM Deepak Goel wrote: > I am confused a bit about the maths: > > Heap-30 GB & Index Size-95 GB is equal to 125GB. And the RAM is 96GB. > > > > > Deepak > "The greatness of a nation can be judged by the way its animals are treated > - Mahatma Gandhi" > > +91 73500 12833 > deic...@gmail.com > > Facebook: https://www.facebook.com/deicool > LinkedIn: www.linkedin.com/in/deicool > > "Plant a Tree, Go Green" > > Make In India : http://www.makeinindia.com/home > > > On Tue, Aug 3, 2021 at 12:10 PM Satya Nand .invalid> > wrote: > > > Hi, > > > > We have recently upgraded solr from version 6.5 to version 8.7. But > > opposite to our expectations, the response time increased by 40 %. > > > > On solr 8.7 the difference between optimized and unoptimized index is > also > > very huge. 350 ms on optimized and 650 ms on unoptimized. The difference > is > > only 5 GB in size in cores of optimized and unoptimized. The segment > count > > in the optimized index is 1 and 20 in the unoptimized index. > > > > I wanted to ask, Is this normal behavior on solr 8.7, or was there some > > setting that we forgot to add? Pleas also tell us how can we reduce the > > response time in unoptimzed core. > > > > *Specifications* > > We are using master slave architecture, Polling interval is 3 hours > > RAM- 96 GB > > CPU-14 > > Heap-30 GB > > Index Size-95 GB > > Segments size-20 > > Merge Policy : > > > > class="org.apache.solr.index.TieredMergePolicyFactory"> > > 5 3 > mergePolicyFactory> > > > > -- > > > > > --
OutofMemory Error in solr 6.5
Hi, We are facing a strange issue in our solr system. Most of the days it keeps running fine but once or twice in a month, we face OutofMemory on solr servers. We are using Leader-Follower architecture, one Leader and 4 followers. Strangely we get OutofMemory error on all follower servers. Before the OutOfMemory this exception is found on all servers. Aug, 04 2021 15:26:11 org.apache.solr.servlet.HttpSolrCall search-central-prd-solr-temp1 ERROR: null:java.lang.NullPointerException search-central-prd-solr-temp1 at org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.finish(CollapsingQParserPlugin.java:617) search-central-prd-solr-temp1 at org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:240) search-central-prd-solr-temp1 at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:2027) search-central-prd-solr-temp1 at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1844) search-central-prd-solr-temp1 at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:609) search-central-prd-solr-temp1 at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:547) search-central-prd-solr-temp1 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295) search-central-prd-solr-temp1 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173) search-central-prd-solr-temp1 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2440) search-central-prd-solr-temp1 at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) search-central-prd-solr-temp1 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) search-central-prd-solr-temp1 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347) search-central-prd-solr-temp1 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:298) search-central-prd-solr-temp1 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691) search-central-prd-solr-temp1 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) search-central-prd-solr-temp1 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) search-central-prd-solr-temp1 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) search-central-prd-solr-temp1 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) search-central-prd-solr-temp1 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) search-central-prd-solr-temp1 at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) search-central-prd-solr-temp1 at org.eclipse.jetty.server.Server.handle(Server.java:534) search-central-prd-solr-temp1 at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) search-central-prd-solr-temp1 at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) search-central-prd-solr-temp1 *Configuration* Index size- 95 GB Heap 30 GB Ram-96 GB Polling Interval - 3 Hours Caching- --
Re: OutofMemory Error in solr 6.5
Hi Dominique,

> You don't provide information about the number of documents. Anyway, all
> your cache size and mostly initial size are big. Cache are stored in JVM
> heap.

Document count is 101893353.

> About cache size, most is not always better. Did you make some performance
> benchmarks in order to set these values ?

We increased the cache sizes in the hope of reducing response time. We heavily use group queries with 7-8 boost factors. The average response time on this document set is 136 ms. We receive approx 65 requests/second in peak hours. The replication interval is 3 hours.

The strangest thing is that the system keeps running for days without any issue, so I believe cache size should not be the problem. If the cache size had been the culprit, the issue would have been frequent, wouldn't it?

On Mon, Aug 9, 2021 at 6:44 PM Dominique Bejean wrote:
> Hi,
>
> You don't provide information about the number of documents. Anyway, all
> your cache size and mostly initial size are big. Cache are stored in JVM
> heap.
>
> About cache size, most is not always better. Did you make some performance
> benchmarks in order to set these values ?
>
> Try with the default values, after a few hours check cumulative caches
> statistics in order to decide if you need to increase their sizes. The
> objective is not to have cumulative_hitratio to 100%. There isn't ideal
> value as it is really related to your datas, to the user's queries, to how
> you build your queries ... but 70% is a good value. At some point
> increasing the size again and again won't increase cumulative_hitratio a
> lot as it is a logarithmic curve.
>
> Check also the heap usage with your JVM GC logs and a tool like
> gceasy.io
>
> Regards
>
> Dominique
>
> Le lun. 9 août 2021 à 07:44, Satya Nand a écrit :
> > Hi,
> > We are facing a strange issue in our solr system. Most of the days it keeps
> > running fine but once or twice in a month, we face OutofMemory on solr
> > servers.
> >
> > We are using Leader-Follower architecture, one Leader and 4 followers.
> > Strangely we get OutofMemory error on all follower servers.
> > Before the OutOfMemory this exception is found on all servers.
> > > > Aug, 04 2021 15:26:11 org.apache.solr.servlet.HttpSolrCall > > search-central-prd-solr-temp1 > > ERROR: null:java.lang.NullPointerException search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.finish(CollapsingQParserPlugin.java:617) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:240) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:2027) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1844) > > search-central-prd-solr-temp1 > > at > > > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:609) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:547) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173) > > search-central-prd-solr-temp1 > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2440) > > search-central-prd-solr-temp1 > > at > > org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) > > search-central-prd-solr-temp1 > > at > org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347) > > search-central-prd-solr-temp1 > > at > > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:298) > > search-central-prd-solr-temp1 > > at > > > > > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691) > > search-central-prd-solr-temp1 > > at > > > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) > > search-central-prd-
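Dominique's suggestion above to check the cumulative cache statistics can be scripted. A minimal SolrJ sketch follows; the core name and base URL are illustrative assumptions, not taken from the thread, and it uses the /admin/mbeans handler that backs the admin UI "Plugins / Stats" page:

// Minimal sketch, assuming a standalone core named "products" on localhost.
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class CacheStats {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr/products").build()) {
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("cat", "CACHE");   // only the cache MBeans
      params.set("stats", "true");  // include cumulative_hits, cumulative_hitratio, evictions ...
      NamedList<Object> response =
          client.request(new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/mbeans", params));
      System.out.println(response.get("solr-mbeans"));
    }
  }
}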
Re: OutofMemory Error in solr 6.5
-central-prd-solr-temp1 at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:609) search-central-prd-solr-temp1 at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:547) search-central-prd-solr-temp1 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295) search-central-prd-solr-temp1 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173) search-central-prd-solr-temp1 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2440) search-central-prd-solr-temp1 at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) search-central-prd-solr-temp1 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) search-central-prd-solr-temp1 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347) search-central-prd-solr-temp1 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:298) search-central-prd-solr-temp1 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691) search-central-prd-solr-temp1 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) search-central-prd-solr-temp1 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) search-central-prd-solr-temp1 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) search-central-prd-solr-temp1 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) search-central-prd-solr-temp1 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) search-central-prd-solr-temp1 at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335) search-central-prd-solr-temp1 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) search-central-prd-solr-temp1 at org.eclipse.jetty.server.Server.handle(Server.java:534) search-central-prd-solr-temp1 On Tue, Aug 10, 2021 at 6:30 AM Shawn Heisey wrote: > On 8/8/2021 11:43 PM, Satya Nand wrote: > > We are facing a strange issue in our solr system. Most of the days it > keeps > > running fine but once or twice in a month, we face OutofMemory on solr > > servers. > > > > We are using Leader-Follower architecture, one Leader and 4 followers. > > Strangely we get OutofMemory error on all follower servers. > > Before the OutOfMemory this exception is found on all servers. > > Do you have the actual OutOfMemoryError exception? Can we see that? > There are several resources other than heap memory that will result in > OOME if they are exhausted. It's important to be investigating the > correct resource. 
> > > > autowarmCount="100" /> size="3" > > initialSize="1000" autowarmCount="100" /> > "solr.LRUCache" size="25000" initialSize="512" autowarmCount="512" /> > > If you have five million documents (including those documents that have > been deleted) in a core, then each filterCache entry for that core will > be 625000 bytes, plus some unknown amount of overhead to manage the > entry. Four thousand of them will consume 2.5 billion bytes. If you > have multiple cores each with many documents, the amount of memory > required for the filterCache could get VERY big. > > Until we can see the actual OOME exception, we won't know what resource > you need to investigate. It is frequently NOT memory. > > Thanks, > Shawn > --
Re: OutofMemory Error in solr 6.5
Hi Dominique,

Thanks. But I still have one point of confusion, please help me with it.

> Pretty sure the issue is caused by caches size at new searcher warmup time.

We use leader-follower architecture with a replication interval of 3 hours. This means every 3 hours we get a commit and a new searcher warms up, right? We have frequent indexing; it isn't possible that no document gets indexed in a 3-hour period. So a new searcher warms up 7-8 times a day.

If the issue were due to new-searcher warm-up, we would have hit it many times a day. But we get this OutOfMemory issue once or twice a month, or sometimes once in two months, so some combination of events must be triggering it.

Further, in solrconfig we have maxWarmingSearchers set to 2. Does this mean that at some point up to 3 searchers (1 active and 2 still warming up their caches) might be holding on to their caches, and that this is resulting in the issue? Should we reduce this to 1? What impact would it have if we reduced it?

On Wed, Aug 11, 2021 at 2:15 AM Dominique Bejean wrote:
> Pretty sure the issue is caused by caches size at new searcher warmup time.
>
> Dominique
>
> Le mar. 10 août 2021 à 09:07, Satya Nand a écrit :
>> Hi Dominique,
>>
>>> You don't provide information about the number of documents. Anyway, all
>>> your cache size and mostly initial size are big. Cache are stored in JVM
>>> heap.
>>
>> Document count is 101893353.
>>
>>> About cache size, most is not always better. Did you make some performance
>>> benchmarks in order to set these values ?
>>
>> We increased cache size in the hope to reduce some response time, We
>> heavily use group queries with 7-8 boost factors. The average response time
>> on this document set is 136 ms. We receive approx 65 requests/second in
>> peak hours. The replication interval is 3 hours.
>>
>> The most strange thing about is that system keeps running for days
>> without any issue, So I believe cache size should not be an issue. If the
>> cache size had been the culprit, the issue would have been frequent. isn't
>> it?
>>
>> On Mon, Aug 9, 2021 at 6:44 PM Dominique Bejean <
>> dominique.bej...@eolya.fr> wrote:
>>
>>> Hi,
>>>
>>> You don't provide information about the number of documents. Anyway, all
>>> your cache size and mostly initial size are big. Cache are stored in JVM
>>> heap.
>>>
>>> About cache size, most is not always better. Did you make some performance
>>> benchmarks in order to set these values ?
>>>
>>> Try with the default values, after a few hours check cumulative caches
>>> statistics in order to decide if you need to increase their sizes. The
>>> objective is not to have cumulative_hitratio to 100%. There isn't ideal
>>> value as it is really related to your datas, to the user's queries, to how
>>> you build your queries ... but 70% is a good value. At some point
>>> increasing the size again and again won't increase cumulative_hitratio a
>>> lot as it is a logarithmic curve.
>>>
>>> Check also the heap usage with your JVM GC logs and a tool like
>>> gceasy.io
>>>
>>> Regards
>>>
>>> Dominique
>>>
>>> Le lun. 9 août 2021 à 07:44, Satya Nand a écrit :
>>>
>>> > Hi,
>>> > We are facing a strange issue in our solr system. Most of the days it keeps
>>> > running fine but once or twice in a month, we face OutofMemory on solr
>>> > servers.
>>> >
>>> > We are using Leader-Follower architecture, one Leader and 4 followers.
>>> > Strangely we get OutofMemory error on all follower servers.
>>> > Before the OutOfMemory this exception is found on all servers. >>> > >>> > Aug, 04 2021 15:26:11 org.apache.solr.servlet.HttpSolrCall >>> > search-central-prd-solr-temp1 >>> > ERROR: null:java.lang.NullPointerException >>> search-central-prd-solr-temp1 >>> > at >>> > >>> > >>> org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.finish(CollapsingQParserPlugin.java:617) >>> > search-central-prd-solr-temp1 >>> > at >>> > >>> > >>> org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:240) >>> > search-central-prd-solr-temp1 >>> > at >>> > >>> > >>> org.apach
Re: OutofMemory Error in solr 6.5
Hi Shawn,

Thanks for explaining it so well. We will work on reducing the filter cache size and the autowarm count. Though I have one question:

> If your configured 4000 entry filterCache were to actually fill up, it
> would require nearly 51 billion bytes, and that's just for the one core
> with 101 million documents. This is much larger than the 30GB heap you
> have specified ... I am betting that the filterCache is the reason
> you're hitting OOME.

As you can see from the screenshots (captions below), the filter cache is almost full and the heap is approx 18-20 GB. I think this means the heap is not actually holding 51 GB; otherwise the issue would have been very frequent if the full cache had been taking ~50 GB of space. I also believed that Solr uses some compressed data structures for its caches, and that's how it is able to store the cache in less memory. Isn't it?

Also, the issue is not very frequent. It comes once or twice a month, where all follower servers stop working at the same time due to the OutOfMemory error.

*Filter cache statistics as of 10:08 IST* (screenshot)

*Heap usage* (screenshot)

On Wed, Aug 11, 2021 at 4:12 AM Shawn Heisey wrote:
> On 8/10/2021 1:06 AM, Satya Nand wrote:
> > Document count is 101893353.
>
> The OOME exception confirms that we are dealing with heap memory. That
> means we won't have to look into the other resource types that can cause
> OOME.
>
> With that document count, each filterCache entry is 12736670 bytes, plus
> some small number of bytes for java object overhead. That's 12.7
> million bytes.
>
> If your configured 4000 entry filterCache were to actually fill up, it
> would require nearly 51 billion bytes, and that's just for the one core
> with 101 million documents. This is much larger than the 30GB heap you
> have specified ... I am betting that the filterCache is the reason
> you're hitting OOME.
>
> You need to dramatically reduce the size of your filterCache. Start
> with 256 and see what that gets you. Solr ships with a size of 512.
> Also, see what you can do about making it so that there is a lot of
> re-use possible with queries that you put in the fq parameter. It's
> better to have several fq parameters rather than one parameter with a
> lot of AND clauses -- much more chance of filter re-use.
>
> I notice that you have autowarmCount set to 100 on two caches. (The
> autowarmCount on the documentCache, which you have set to 512, won't be
> used -- that cache cannot be warmed directly. It is indirectly warmed
> when the other caches are warmed.) This means that every time you issue
> a commit that opens a new searcher, Solr will execute up to 200 queries
> as part of the cache warming. This can make the warming take a VERY
> long time. Consider reducing autowarmCount. It's not causing your OOME
> problems, but it might be making commits take a very long time.
>
> Thanks,
> Shawn

--
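To make Shawn's arithmetic concrete, a small back-of-the-envelope sketch using the numbers from this thread (the helper class itself is only illustrative):

// Rough filterCache sizing estimate, following Shawn's reasoning above:
// a full bitset entry needs roughly maxDoc / 8 bytes.
public class FilterCacheEstimate {
  public static void main(String[] args) {
    long maxDoc = 101_893_353L;   // document count reported in the thread
    int cacheSize = 4000;         // configured filterCache size

    long bytesPerEntry = maxDoc / 8;            // ~12.7 million bytes per cached filter
    long worstCaseBytes = bytesPerEntry * cacheSize;

    System.out.printf("per entry: %,d bytes%n", bytesPerEntry);        // ~12,736,669
    System.out.printf("full cache: %,d bytes (~%d GiB)%n",
        worstCaseBytes, worstCaseBytes / (1024L * 1024 * 1024));       // ~51 billion bytes
  }
}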
Re: Big difference in response time on solr 8.7; Optimized vs un Optimized core
Thanks, Deepak, We will try doing this. But still, I am wondering what led to the increase in response time this much from solr 6.5 to solr 8.7, keeping everything same. We are facing an increase of 100-150 ms. On Wed, Aug 11, 2021 at 11:55 AM Deepak Goel wrote: > If I were you, then I would stick to the 128 GB machine. And then look at > other parameters to tune... > > > Deepak > "The greatness of a nation can be judged by the way its animals are treated > - Mahatma Gandhi" > > +91 73500 12833 > deic...@gmail.com > > Facebook: https://www.facebook.com/deicool > LinkedIn: www.linkedin.com/in/deicool > > "Plant a Tree, Go Green" > > Make In India : http://www.makeinindia.com/home > > > On Tue, Aug 3, 2021 at 3:25 PM Satya Nand .invalid> > wrote: > > > Hi Deepak, > > > > We actually tried with 128 GB machine, Didn't help in response time. So > we > > moved back to 96GB. > > > > On Tue, Aug 3, 2021 at 2:11 PM Deepak Goel wrote: > > > > > I am confused a bit about the maths: > > > > > > Heap-30 GB & Index Size-95 GB is equal to 125GB. And the RAM is 96GB. > > > > > > > > > > > > > > > Deepak > > > "The greatness of a nation can be judged by the way its animals are > > treated > > > - Mahatma Gandhi" > > > > > > +91 73500 12833 > > > deic...@gmail.com > > > > > > Facebook: https://www.facebook.com/deicool > > > LinkedIn: www.linkedin.com/in/deicool > > > > > > "Plant a Tree, Go Green" > > > > > > Make In India : http://www.makeinindia.com/home > > > > > > > > > On Tue, Aug 3, 2021 at 12:10 PM Satya Nand > > .invalid> > > > wrote: > > > > > > > Hi, > > > > > > > > We have recently upgraded solr from version 6.5 to version 8.7. But > > > > opposite to our expectations, the response time increased by 40 %. > > > > > > > > On solr 8.7 the difference between optimized and unoptimized index is > > > also > > > > very huge. 350 ms on optimized and 650 ms on unoptimized. The > > difference > > > is > > > > only 5 GB in size in cores of optimized and unoptimized. The segment > > > count > > > > in the optimized index is 1 and 20 in the unoptimized index. > > > > > > > > I wanted to ask, Is this normal behavior on solr 8.7, or was there > some > > > > setting that we forgot to add? Pleas also tell us how can we reduce > the > > > > response time in unoptimzed core. > > > > > > > > *Specifications* > > > > We are using master slave architecture, Polling interval is 3 hours > > > > RAM- 96 GB > > > > CPU-14 > > > > Heap-30 GB > > > > Index Size-95 GB > > > > Segments size-20 > > > > Merge Policy : > > > > > > > > > > class="org.apache.solr.index.TieredMergePolicyFactory"> > > > > 5 name="segmentsPerTier">3 > > > > > mergePolicyFactory> > > > > > > > > -- > > > > > > > > > > > > > > > -- > > > > > --
Re: Big difference in response time on solr 8.7; Optimized vs un Optimized core
Hi Deepak,

By "keeping everything same" I mean:
- Heap size
- Index size (almost the same)
- Schema
- Solr config

We have made only one change: earlier we were using the synonym_edismax parser. As this parser was not available for Solr 8.7, we replaced it with edismax + SynonymGraphFilterFactory to handle multi-word synonyms.

Also, on Solr 8.7 the difference between the optimized and unoptimized index is very large: 180+ ms on the optimized core and 350+ ms on the unoptimized one. The difference in size between the optimized and unoptimized cores is only 5 GB. The segment count is 1 in the optimized index and 20 in the unoptimized index.

On Wed, Aug 11, 2021 at 12:39 PM Deepak Goel wrote:
> You will have to elaborate a bit on: "keeping everything same"
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deic...@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
> On Wed, Aug 11, 2021 at 12:38 PM Satya Nand wrote:
> > Thanks, Deepak, We will try doing this.
> >
> > But still, I am wondering what led to the increase in response time this
> > much from solr 6.5 to solr 8.7, keeping everything same.
> > We are facing an increase of 100-150 ms.
> >
> > On Wed, Aug 11, 2021 at 11:55 AM Deepak Goel wrote:
> > > If I were you, then I would stick to the 128 GB machine. And then look at
> > > other parameters to tune...
> > >
> > > Deepak
> > > "The greatness of a nation can be judged by the way its animals are treated
> > > - Mahatma Gandhi"
> > >
> > > +91 73500 12833
> > > deic...@gmail.com
> > >
> > > Facebook: https://www.facebook.com/deicool
> > > LinkedIn: www.linkedin.com/in/deicool
> > >
> > > "Plant a Tree, Go Green"
> > >
> > > Make In India : http://www.makeinindia.com/home
> > >
> > > On Tue, Aug 3, 2021 at 3:25 PM Satya Nand wrote:
> > > > Hi Deepak,
> > > >
> > > > We actually tried with 128 GB machine, Didn't help in response time. So we
> > > > moved back to 96GB.
> > > >
> > > > On Tue, Aug 3, 2021 at 2:11 PM Deepak Goel wrote:
> > > > > I am confused a bit about the maths:
> > > > >
> > > > > Heap-30 GB & Index Size-95 GB is equal to 125GB. And the RAM is 96GB.
> > > > >
> > > > > Deepak
> > > > > "The greatness of a nation can be judged by the way its animals are treated
> > > > > - Mahatma Gandhi"
> > > > >
> > > > > +91 73500 12833
> > > > > deic...@gmail.com
> > > > >
> > > > > Facebook: https://www.facebook.com/deicool
> > > > > LinkedIn: www.linkedin.com/in/deicool
> > > > >
> > > > > "Plant a Tree, Go Green"
> > > > >
> > > > > Make In India : http://www.makeinindia.com/home
> > > > >
> > > > > On Tue, Aug 3, 2021 at 12:10 PM Satya Nand wrote:
> > > > > > Hi,
> > > > > >
> > > > > > We have recently upgraded solr from version 6.5 to version 8.7. But
> > > > > > opposite to our expectations, the response time increased by 40 %.
> > > > > >
> > > > > > On solr 8.7 the difference between optimized and unoptimized index is also
> > > > > > very huge. 350 ms on optimized and 650 ms on unoptimized. The difference is
> > > > > > only 5 GB in size in cores of optimized and unoptimized. The segment count
> > > > > > in the optimized index is 1 and 20 in the unoptimized index.
> > > > > > > > > > > > I wanted to ask, Is this normal behavior on solr 8.7, or was > there > > > some > > > > > > setting that we forgot to add? Pleas also tell us how can we > reduce > > > the > > > > > > response time in unoptimzed core. > > > > > > > > > > > > *Specifications* > > > > > > We are using master slave architecture, Polling interval is 3 > hours > > > > > > RAM- 96 GB > > > > > > CPU-14 > > > > > > Heap-30 GB > > > > > > Index Size-95 GB > > > > > > Segments size-20 > > > > > > Merge Policy : > > > > > > > > > > > > > > > > class="org.apache.solr.index.TieredMergePolicyFactory"> > > > > > > 5 > > name="segmentsPerTier">3 > > > > > > > > > mergePolicyFactory> > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > -- > > > > > --
Re: OutofMemory Error in solr 6.5
Hi Shawn, Please find the images. *Filter cache stats:* https://drive.google.com/file/d/19MHEzi9m3KS4s-M86BKFiwmnGkMh3DGx/view?usp=sharing *Heap stats* https://drive.google.com/file/d/1Q62ea-nFh9UjbcVcBJ39AECWym6nk2Yg/view?usp=sharing I'm curious whether the 101 million document count is for one shard > replica or for the whole collection. How many documents are in all the > shard replicas handled by one Solr instance? We are not using solr cloud. We are using standalone solr with Master-slave architecture. 101 million documents are in one core. On Wed, Aug 11, 2021 at 5:20 PM Shawn Heisey wrote: > On 8/10/2021 11:17 PM, Satya Nand wrote: > > Thanks for explaining it so well. We will work on reducing the filter > > cache size and auto warm count. > > > > Though I have one question. > > > > If your configured 4000 entry filterCache were to actually fill up, > it > > would require nearly 51 billion bytes, and that's just for the one > > core > > with 101 million documents. This is much larger than the 30GB > > heap you > > have specified ... I am betting that the filterCache is the reason > > you're hitting OOME. > > > > > > As you can see from the below screenshots the filter cache is almost > > full and the heap is approx 18-20 GB. I think this means heap is not > > actually taking 51 GB of space. Otherwise, the issue would have been > > very frequent if the full cache had been taking ~50 GB of space. I > > also believed the solr uses some compressed data structures to > > accumulate its cache, That' how it is able to store the cache in less > > memory. Isn't it? > > > > Also, the issue is not very frequent. It comes once or twice a month, > > Where all follower servers stop working at the same time due to > > OutOfMemory error. > > We can't see any of the images. The mailing list software stripped > them. Most attachments do not come through -- you'll need to find a > file sharing website and give us links. Dropbox is a good choice, and > there are others. > > The cache may not be getting full, but each entry is over 12 megabytes > in size, so it will not need to be full to cause problems. It does not > get compressed. Solr (actually Lucene) does use compression in the > index file formats. It would be possible to compress the bitmap for a > filterCache entry, but that would slow things down when there is a cache > hit. I have no idea how much it would slow things down. > > The cache warming probably isn't the problem. That's only going to > (temporarily) add 100 new entries to a new cache, then the old cache > will be gone. If the filterCache is indeed the major memory usage, it's > probably queries that cause it to get large. > > I'm curious whether the 101 million document count is for one shard > replica or for the whole collection. How many documents are in all the > shard replicas handled by one Solr instance? > > Thanks, > Shawn > > > --
Re: OutofMemory Error in solr 6.5
Thanks, Shawn. This makes sense. Filter queries with high hit counts could be the trigger for the out-of-memory errors; that would explain why it is so infrequent. We will take another look at our filter queries and also try reducing the filter cache size.

One question though:

> There is an alternate format for filterCache entries, that just lists
> the IDs of the matching documents. This only gets used when the
> hitcount for the filter is low.

Does this alternate format use a different data structure (other than the bitmap) to store the document ids for filters with a low document count? And does the size constraint (the filterCache size) apply only to the bitmap entries, or to this alternate structure too, or to their sum?

On Wed, 11 Aug, 2021, 6:50 pm Shawn Heisey, wrote:
> On 8/11/2021 6:04 AM, Satya Nand wrote:
> > *Filter cache stats:*
> > https://drive.google.com/file/d/19MHEzi9m3KS4s-M86BKFiwmnGkMh3DGx/view?usp=sharing
>
> This shows the current size as 3912, almost full.
>
> There is an alternate format for filterCache entries, that just lists
> the IDs of the matching documents. This only gets used when the
> hitcount for the filter is low. I do not know what threshold it uses to
> decide that the hitcount is low enough to use the alternate format, and
> I do not know where in the code to look for the answer. This is
> probably why you can have 3912 entries in the cache without blowing the
> heap.
>
> I bet that when the heap gets blown, the filter queries Solr receives
> are such that they cannot use the alternate format, and thus require the
> full 12.7 million bytes. Get enough of those, and you're going to need
> more heap than 30GB. I bet that if you set the heap to 31G, the OOMEs
> would occur a little less frequently. Note that if you set the heap to
> 32G, you actually have less memory available than if you set it to 31G
> -- At 32GB, Java must switch from 32 bit pointers to 64 bit pointers.
> Solr creates a LOT of objects on the heap, so that difference adds up.
>
> Discussion item for those with an interest in the low-level code: What
> kind of performance impact would it cause to use a filter bitmap
> compressed with run-length encoding? Would that happen at the Lucene
> level rather than the Solr level?
>
> To fully solve this issue, you may need to re-engineer your queries so
> that fq values are highly reusable, and non-reusable filters are added
> to the main query. Then you would not need a very large cache to obtain
> a good hit ratio.
>
> Thanks,
> Shawn

--
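For intuition about the two entry formats Shawn describes, a rough sketch of the memory math. The actual threshold Lucene/Solr uses is not stated in the thread, and the 4-bytes-per-hit figure below is an assumption for illustration only:

// Illustrative only: compares the approximate memory cost of a full bitset
// entry (maxDoc / 8 bytes, independent of hit count) against a sorted list
// of matching doc ids (about 4 bytes per hit for 32-bit ids).
public class FilterEntryFormats {
  public static void main(String[] args) {
    long maxDoc = 101_893_353L;

    long bitsetBytes = maxDoc / 8;                    // ~12.7 MB regardless of hits
    long[] hitCounts = {1_000, 100_000, 3_000_000, 50_000_000};

    for (long hits : hitCounts) {
      long idListBytes = hits * 4L;                   // 32-bit doc ids
      String cheaper = idListBytes < bitsetBytes ? "id list" : "bitset";
      System.out.printf("hits=%,d  idList=%,d bytes  bitset=%,d bytes  -> %s%n",
          hits, idListBytes, bitsetBytes, cheaper);
    }
  }
}

The takeaway matches Shawn's point: filters that match few documents stay cheap, while a burst of high-hit-count filters pushes every entry toward the full ~12.7 MB bitset.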
Re: High CPU utilisation on Solr-8.11.0
Hi Michael, 1. set `pf=` (phrase field empty string), disabling implicit phrase query > building. This would help give a sense of whether phrase queries are > involved in the performance issues you're seeing. We are also in the process of moving from standalone 6.6 to 8.7 Solr cloud, We also noticed a huge response time increase (91 ms To 170 ms +). We tried applying the tweak of disabling the pf field and response time was back to normal. So somehow pf was responsible for increased response time. We are using both query-time multi-term synonyms and WordDelimiter[Graph]Filter. What should we do next from here as we can't disable the pf field? Cluster Configuration: 3 Solr Nodes: 5 CPU, 42 GB Ram (Each) 3 Zookeeper Nodes: 1 CPU, 2 GB Ram (Each) 3 Shards: 42m Documents, 42 GB (Each) Heap: 8 GB There are no deleted documents in the cluster and no updates going on. We are trying to match the performance first. On Sat, Mar 26, 2022 at 9:42 PM Michael Gibney wrote: > Are you using query-time multi-term synonyms or WordDelimiter[Graph]Filter? > -- these can trigger "graph phrase" queries, which are handled _quite_ > differently in Solr 8.11 vs 6.5 (and although unlikely to directly cause > the performance issues you're observing, might well explain the performance > discrepancy). If you're _not_ using either of those, then the rest of this > message is likely irrelevant. > > One thing to possibly keep an eye out for (in addition to gathering more > evidence, as Mike Drob suggests): 6.5 started using span queries for "graph > phrase" queries (LUCENE-7699), but the resulting phrase queries were > completely ignored in Solr (bug) until 7.6 (SOLR-12243). Completely > ignoring complex phrase queries did however greatly reduce latency and CPU > load on 6.5! > > 7.6 started paying attention to these queries again (SOLR-12243), but also > went back to "fully-enumerated" combinatoric approach to phrase queries > when `ps` (phrase slop) is greater than 0 (LUCENE-8531). > > Some parameters you could tweak, assuming you're using edismax: > 1. set `pf=` (phrase field empty string), disabling implicit phrase query > building. This would help give a sense of whether phrase queries are > involved in the performance issues you're seeing. > 2. set `ps=0` (phrase slop 0), this should allow span queries to be built, > which should generally be more efficient than analogous non-span-query > approach (basically this would make the change introduced by LUCENE-8531 > irrelevant); tangentially: the special case building span queries for > `ps=0` is removed as of Lucene 9.0 (will be removed as of Solr 9.0 -- not > directly relevant to this issue though). > > Michael > > On Sat, Mar 26, 2022 at 8:26 AM Mike Drob wrote: > > > Can you provide more details on what they CPU time is spent on? Maybe > look > > at some JFR profiles or collect several jstacks to see where they > > bottlenecks are. > > > > On Sat, Mar 26, 2022 at 3:49 AM Modassar Ather > > wrote: > > > > > Hi, > > > > > > We are trying to migrate to Solr-8.11.0 from Solr-6.5.1. Following are > > the > > > details of Solr installation. > > > > > > Server : EC2 instance with 32 CPUs and 521 GB > > > Storage : EBS Volume. General Purpose SSD (gp3) with 3000/5000 IOPS > > > Ubuntu :Ubuntu 20.04.3 LTS > > > Java : openjdk 11.0.14 > > > SolrCloud : 12 shards having a total 4+ TB index. Each node has a 30GB > > max > > > memory limit. 
> > > GC setting on Solr : G1GC > > > Solr query timeout : 5 minutes > > > > > > During testing we observed a high CPU utilisation and few of the > queries > > > with wildcard queries are timing out. These queries are getting > executed > > > completely on Solr-6.5.1. > > > After tuning a few of the parameters of GC settings the CPU utilisation > > > came down but it is still high when compared with Solr-6.5.1 and some > > > queries with wildcard queries are still failing. > > > > > > Kindly provide your suggestions. > > > > > > Thanks, > > > Modassar > > > > > >
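A minimal SolrJ sketch of the two experiments Michael suggests, for anyone wanting to reproduce the comparison. The ZooKeeper address, collection name, qf fields and query text are placeholders, not taken from the thread:

// Experiments from Michael's reply: (1) disable pf entirely, or (2) keep pf
// but force ps=0 so graph phrases can be built as span queries.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import java.util.Collections;
import java.util.Optional;

public class PhraseQueryExperiments {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1:2181"), Optional.empty()).build()) {

      SolrQuery q = new SolrQuery("steel pipe fittings");
      q.set("defType", "edismax");
      q.set("qf", "title description");

      // Experiment 1: disable implicit phrase-query building.
      q.set("pf", "");

      // Experiment 2 (instead of 1): keep pf but set phrase slop to 0.
      // q.set("pf", "title");
      // q.set("ps", "0");

      QueryResponse rsp = client.query("products", q);
      System.out.println("QTime=" + rsp.getQTime() + " hits=" + rsp.getResults().getNumFound());
    }
  }
}

Comparing QTime with and without the tweak, as the earlier messages in this thread did, is what indicates whether graph phrase queries are the culprit.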
Re: High CPU utilisation on Solr-8.11.0
Thanks, Michael. I missed that mail thread among all responses. I will check that too. On Thu, May 5, 2022 at 6:26 PM Michael Gibney wrote: > Did you yet dig through the mailing list thread link that I posted earlier > in this thread? It explains in more depth, suggests a number of possible > mitigations, and has a bunch of links to jira issues that provide extra > context. Off the cuff, I'd say that setting `enableGraphQueries=false` may > be most immediately helpful in terms of restoring performance. > > (As an aside: from my perspective though, even if you can restore > performance, it would be at the expense of nuances of functionality. Longer > term I'd really like to help solve this properly, involving some > combination of the issues linked to in the above thread ...) > > Michael > > On Thu, May 5, 2022 at 3:01 AM Satya Nand .invalid> > wrote: > > > Hi Michael, > > > > > > 1. set `pf=` (phrase field empty string), disabling implicit phrase query > > > building. This would help give a sense of whether phrase queries are > > > involved in the performance issues you're seeing. > > > > > > We are also in the process of moving from standalone 6.6 to 8.7 Solr > cloud, > > We also noticed a huge response time increase (91 ms To 170 ms +). > > > > We tried applying the tweak of disabling the pf field and response time > was > > back to normal. So somehow pf was responsible for increased response > time. > > > > We are using both query-time multi-term synonyms and > > WordDelimiter[Graph]Filter. > > > > What should we do next from here as we can't disable the pf field? > > > > Cluster Configuration: > > > > 3 Solr Nodes: 5 CPU, 42 GB Ram (Each) > > 3 Zookeeper Nodes: 1 CPU, 2 GB Ram (Each) > > 3 Shards: 42m Documents, 42 GB (Each) > > Heap: 8 GB > > > > > > There are no deleted documents in the cluster and no updates going on. We > > are trying to match the performance first. > > > > > > > > > > > > > > > > On Sat, Mar 26, 2022 at 9:42 PM Michael Gibney < > mich...@michaelgibney.net> > > wrote: > > > > > Are you using query-time multi-term synonyms or > > WordDelimiter[Graph]Filter? > > > -- these can trigger "graph phrase" queries, which are handled _quite_ > > > differently in Solr 8.11 vs 6.5 (and although unlikely to directly > cause > > > the performance issues you're observing, might well explain the > > performance > > > discrepancy). If you're _not_ using either of those, then the rest of > > this > > > message is likely irrelevant. > > > > > > One thing to possibly keep an eye out for (in addition to gathering > more > > > evidence, as Mike Drob suggests): 6.5 started using span queries for > > "graph > > > phrase" queries (LUCENE-7699), but the resulting phrase queries were > > > completely ignored in Solr (bug) until 7.6 (SOLR-12243). Completely > > > ignoring complex phrase queries did however greatly reduce latency and > > CPU > > > load on 6.5! > > > > > > 7.6 started paying attention to these queries again (SOLR-12243), but > > also > > > went back to "fully-enumerated" combinatoric approach to phrase queries > > > when `ps` (phrase slop) is greater than 0 (LUCENE-8531). > > > > > > Some parameters you could tweak, assuming you're using edismax: > > > 1. set `pf=` (phrase field empty string), disabling implicit phrase > query > > > building. This would help give a sense of whether phrase queries are > > > involved in the performance issues you're seeing. > > > 2. 
set `ps=0` (phrase slop 0), this should allow span queries to be > > built, > > > which should generally be more efficient than analogous non-span-query > > > approach (basically this would make the change introduced by > LUCENE-8531 > > > irrelevant); tangentially: the special case building span queries for > > > `ps=0` is removed as of Lucene 9.0 (will be removed as of Solr 9.0 -- > not > > > directly relevant to this issue though). > > > > > > Michael > > > > > > On Sat, Mar 26, 2022 at 8:26 AM Mike Drob wrote: > > > > > > > Can you provide more details on what they CPU time is spent on? Maybe > > > look > > > > at some JFR profiles or collect several jstacks to see where they > > > > bottlenecks are. > > > > > > > > On Sat, Mar 26,
Delete by Id in solr cloud
Hi,

I have an 8-shard collection where I am using *compositeId* routing with *router.field* (a field named parentglUsrId). The unique key of the collection is a different field, *displayid*.

I am trying a delete-by-id operation where I pass a list of displayids to delete, and I observed that no documents are being deleted. When I checked the logs I found that the deletion request for an id does not go to the correct shard; it lands on some other shard that is not hosting that id. This is probably because Solr picks the shard based on the hash of displayid, while my sharding is done on the basis of parentglUsrId.

Is there anything I am missing? It seems like a simple operation. What do I need to do to broadcast a delete-by-id request to all shards so the relevant id can be deleted on whichever shard holds it?
Re: Delete by Id in solr cloud
Hi Radu,

I am using SolrJ for executing the query. I couldn't find any function which accepts additional parameters like routing, shards, SolrParams etc. I also tried delete-by-query instead of deleteById, but it is very slow.

https://solr.apache.org/docs/8_1_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html

deleteById(String collection, List<String> ids, int commitWithinMs)
(https://solr.apache.org/docs/7_3_1/solr-solrj/org/apache/solr/client/solrj/SolrClient.html#deleteById-java.lang.String-java.util.List-int-)

On Tue, Jun 28, 2022 at 12:58 PM Radu Gheorghe wrote:
> Hi Satya,
>
> I didn't try it, but does it work if you add "shards=shard1,shard2..." to
> the request?
>
> Worst case scenario, if you have the address of each shard (you can get it
> from Zookeeper), you can run the delete command N times, one hitting each
> shard address.
>
> Best regards,
> Radu
> --
> Elasticsearch/OpenSearch & Solr Consulting, Production Support & Training
> Sematext Cloud - Full Stack Observability
> http://sematext.com/
>
> On Tue, Jun 28, 2022 at 7:55 AM Satya Nand wrote:
> > Hi,
> >
> > I have an 8 shards collection, where I am using *compositeId* routing
> > with *router.field* (a field named parentglUsrId). The unique Id of the
> > collection is a different field *displayid*.
> >
> > I am trying a delete by id operation where I pass a list of displayids to
> > delete. I observed that no documents are being deleted. when I checked the
> > logs I found that the deletion request for an Id might not go to the
> > correct shard and perform a request on some other shard that was not
> > hosting this Id. This might be due to solr trying to find the shard based
> > on the hash of displayid but my sharding is done on the basis of
> > parentglUsrId.
> >
> > is there anything I am missing? Because it seems like a simple operation.
> > what do I need to do to broadcast a delete by id request to all shards so
> > relevant id can be deleted on each shard?
>
Re: Delete by Id in solr cloud
Thanks, Peter. I am checking that; also, the UpdateRequest class seems to have methods that take routes as input. I will see if it helps.

On Tue, Jun 28, 2022 at 3:19 PM Peter Lancaster <peter.lancas...@findmypast.com> wrote:
> Hi Satya,
>
> I think you would need to use a HttpSolrClient that uses the url of the
> shard where the record exists.
>
> Regards,
> Peter.
>
> -----Original Message-----
> From: Satya Nand
> Sent: 28 June 2022 10:43
> To: users@solr.apache.org
> Subject: Re: Delete by Id in solr cloud
>
> EXTERNAL SENDER: Do not click any links or open any attachments unless you
> trust the sender and know the content is safe.
>
> Hi Radu,
>
> I am using solrj for executing the query. I couldn't find any function
> with accepts additional parameters like routing, shards, solr Params etc.
>
> I also tried delete by query instead of deleteById, But it is very slow.
>
> https://solr.apache.org/docs/8_1_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html
>
> deleteById(String collection, List<String> ids, int commitWithinMs)
> (https://solr.apache.org/docs/7_3_1/solr-solrj/org/apache/solr/client/solrj/SolrClient.html#deleteById-java.lang.String-java.util.List-int-)
>
> On Tue, Jun 28, 2022 at 12:58 PM Radu Gheorghe <radu.gheor...@sematext.com>
> wrote:
> > Hi Satya,
> >
> > I didn't try it, but does it work if you add "shards=shard1,shard2..."
> > to the request?
> >
> > Worst case scenario, if you have the address of each shard (you can
> > get it from Zookeeper), you can run the delete command N times, one
> > hitting each shard address.
> >
> > Best regards,
> > Radu
> > --
> > Elasticsearch/OpenSearch & Solr Consulting, Production Support &
> > Training Sematext Cloud - Full Stack Observability
> > http://sematext.com/
> >
> > On Tue, Jun 28, 2022 at 7:55 AM Satya Nand wrote:
> > > Hi,
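For anyone following along, a minimal sketch of the route-aware delete that the UpdateRequest class exposes. The ZooKeeper address, collection name, ids and parentglUsrId route values are placeholders, and whether this routes correctly on 8.7 is exactly what the rest of this thread is trying to establish:

// Minimal sketch, assuming SolrJ 8.x.
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

public class RoutedDelete {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1:2181"), Optional.empty()).build()) {

      UpdateRequest req = new UpdateRequest();
      // deleteById(id, route): the second argument is the routing key, i.e. the
      // router.field value (parentglUsrId) the document was indexed with,
      // not the unique key (displayid).
      Map<String, String> idToRoute = Map.of(
          "displayid-123", "parentglUsrId-9",
          "displayid-456", "parentglUsrId-17");
      idToRoute.forEach(req::deleteById);

      req.setCommitWithin(10_000);
      req.process(client, "my_collection");
    }
  }
}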
Re: Delete by Id in solr cloud
Hi Peter,

I have tried using HttpSolrClient, but the receiving shard also forwards the request to some other shard. Is there any way we can achieve this on Solr 8.7?

On Thu, Jun 30, 2022 at 3:41 AM r ohara wrote:
> Hi Satya,
> I think it's a bug with using compositeId. We had the same issue, and had
> to use deleteByQuery instead, but like you said, it's much slower. We're
> using solr 8.11
>
> On Tue, Jun 28, 2022 at 4:59 AM Satya Nand wrote:
> > Thanks, Peter,
> > I am checking that, also UpdateRequest class seems to have methods that
> > take routes as input. I will see if it helps.
> >
> > On Tue, Jun 28, 2022 at 3:19 PM Peter Lancaster <
> > peter.lancas...@findmypast.com> wrote:
> > > Hi Satya,
> > >
> > > I think you would need to use a HttpSolrClient that uses the url of the
> > > shard where the record exists.
> > >
> > > Regards,
> > > Peter.
> > >
> > > -----Original Message-----
> > > From: Satya Nand
> > > Sent: 28 June 2022 10:43
> > > To: users@solr.apache.org
> > > Subject: Re: Delete by Id in solr cloud
> > >
> > > EXTERNAL SENDER: Do not click any links or open any attachments unless
> > > you trust the sender and know the content is safe.
> > >
> > > Hi Radu,
> > >
> > > I am using solrj for executing the query. I couldn't find any function
> > > with accepts additional parameters like routing, shards, solr Params etc.
> > >
> > > I also tried delete by query instead of deleteById, But it is very slow.
> > >
> > > https://solr.apache.org/docs/8_1_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html
> > >
> > > deleteById(String collection, List<String> ids, int commitWithinMs)
> > > (https://solr.apache.org/docs/7_3_1/solr-solrj/org/apache/solr/client/solrj/SolrClient.html#deleteById-java.lang.String-java.util.List-int-)
> > >
> > > On Tue, Jun 28, 2022 at 12:58 PM Radu Gheorghe <radu.gheor...@sematext.com>
Re: Delete by Id in solr cloud
Hi r ohara, Yeah, I found the bug: https://issues.apache.org/jira/browse/SOLR-8889. It seems like it was fixed in Solr 8.10 and 9. So nothing worked besides using delete by query? On Thu, Jun 30, 2022 at 3:41 AM r ohara wrote: > Hi Satya, > I think it's a bug with using compositeId. We had the same issue, and had > to use deleteByQuery instead, but like you said, it's much slower. We're > using solr 8.11 > > On Tue, Jun 28, 2022 at 4:59 AM Satya Nand .invalid> > wrote: > > > Thanks, Peter, > > I am checking that, also the UpdateRequest class seems to have methods that > > take routes as input. I will see if it helps. > > > > On Tue, Jun 28, 2022 at 3:19 PM Peter Lancaster < > > peter.lancas...@findmypast.com> wrote: > > > > > Hi Satya, > > > > > > I think you would need to use a HttpSolrClient that uses the url of the > > > shard where the record exists. > > > > > > Regards, > > > Peter. > > > > > > -----Original Message----- > > > From: Satya Nand > > > Sent: 28 June 2022 10:43 > > > To: users@solr.apache.org > > > Subject: Re: Delete by Id in solr cloud > > > > > > EXTERNAL SENDER: Do not click any links or open any attachments unless > > you trust the sender and know the content is safe. > > > > > > Hi Radu, > > > > > > I am using SolrJ for executing the query. I couldn't find any function > > > which accepts additional parameters like routing, shards, SolrParams, etc. > > > > > > I also tried delete by query instead of deleteById, but it is very slow. > > > > > > https://solr.apache.org/docs/8_1_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrClient.html > > > deleteById(String collection, List<String> ids, int commitWithinMs) > > > https://solr.apache.org/docs/7_3_1/solr-solrj/org/apache/solr/client/solrj/SolrClient.html#deleteById-java.lang.String-java.util.List-int- > > > > > > On Tue, Jun 28, 2022 at 12:58 PM Radu Gheorghe < > > radu.gheor...@semat
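For anyone landing on this thread later, this is the slower fallback we are using in the meantime; a rough SolrJ sketch, with the ZooKeeper host, collection and ids as placeholders:

    import java.util.List;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;

    public class DeleteByQueryFallback {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    List.of("zk-host:2181"), Optional.empty()).build()) {
                // Delete by query works regardless of routing, but is much more
                // expensive than a routed deleteById.
                client.deleteByQuery("im-search", "id:(DOC123 OR DOC456)");
                client.commit("im-search");
            }
        }
    }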
Forcing solr to run query on replica Nodes
Hi, Currently we have an 8+1 Solr node cluster, where 1 indexing node contains all (8) NRT primary shards. This is where all indexing happens. Then we have another 8 nodes consisting of one PULL replica of each primary shard. For querying, we have set *shards.preference to PULL replicas* so that all queries are served from pull replicas. How can I force Solr to use only pull replicas? In case one of the pull replicas is not available, I want partial results to be returned from the remaining 7 replicas, but I never want to query the NRT replicas.
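For context, this is roughly how we set the preference from SolrJ today; a minimal sketch, with the collection name as a placeholder (this only expresses a preference, it does not exclude NRT replicas, which is exactly my problem):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class PullPreferenceQuery {
        // Prefer PULL replicas, and tolerate a missing shard instead of failing the request.
        static QueryResponse run(CloudSolrClient cloudClient) throws Exception {
            SolrQuery q = new SolrQuery("*:*");
            q.set("shards.preference", "replica.type:PULL");
            q.set("shards.tolerant", "true");
            return cloudClient.query("im-search", q); // collection name is a placeholder
        }
    }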
Re: Forcing solr to run query on replica Nodes
Thank you, Shawn. If I eliminate this indexing node and create 8 NRT shards on these 8 query nodes, indexing will be happening on all 8 nodes and queries too. Will it have any impact on response time? Currently the commit interval is 15 minutes. On Tue, Aug 30, 2022 at 8:46 PM Shawn Heisey wrote: > On 8/30/22 08:08, Satya Nand wrote: > > For querying, we have set *shards.preference to PULL replicas* so that all queries > > are served from pull replicas. > > > > How can I force Solr to use only pull replicas? In case one of the pull > > replicas is not available, I want partial results to be returned from the remaining 7 > > replicas, but I never want to query the NRT replicas. > > With that shards.preference set to a replica type of PULL, it will only > go to NRT if it has no other choice. > > I am not aware of any way to *force* it to only use the preferred type. > Creating an option for that has the potential to interfere with high > availability, so I don't know how receptive devs will be to the idea. > You should open an enhancement issue in Jira. > > Thanks, > Shawn > >
Re: Forcing solr to run query on replica Nodes
. On Thu, Sep 1, 2022 at 11:22 AM Satya Nand wrote: > Thank you, Shawn. > If I eliminate this indexing node and create 8 NRT shards on these 8 query > nodes, indexing will be happening on all 8 nodes and queries too. > > Will it have any impact on response time? Currently the commit interval is 15 > minutes. > > On Tue, Aug 30, 2022 at 8:46 PM Shawn Heisey > wrote: > >> On 8/30/22 08:08, Satya Nand wrote: >> > For querying, we have set *shards.preference to PULL replicas* so that all >> queries >> > are served from pull replicas. >> > >> > How can I force Solr to use only pull replicas? In case one of the pull >> > replicas is not available, I want partial results to be returned >> from the remaining 7 >> > replicas, but I never want to query the NRT replicas. >> >> With that shards.preference set to a replica type of PULL, it will only >> go to NRT if it has no other choice. >> >> I am not aware of any way to *force* it to only use the preferred type. >> Creating an option for that has the potential to interfere with high >> availability, so I don't know how receptive devs will be to the idea. >> You should open an enhancement issue in Jira. >> >> Thanks, >> Shawn >> >>
Re: Forcing solr to run query on replica Nodes
Thanks, Shawn. I will try to create another post covering as many details as possible, so somebody from this mailing list can help me review my architecture. On Wed, Sep 7, 2022 at 9:21 PM Shawn Heisey wrote: > On 8/31/22 23:52, Satya Nand wrote: > > Thank you, Shawn. > > If I eliminate this indexing node and create 8 NRT shards on these 8 > query > > nodes, indexing will be happening on all 8 nodes and queries too. > > > > Will it have any impact on response time? Currently the commit interval is > 15 > > minutes. > > > Heavy indexing most likely would impact response time. Whether it's a > significant impact is something I can't predict. There are simply too > many unknowns, and even with a lot more information, any prediction > would just be a guess. > > Thanks, > Shawn > >
Solr Cloud Architecture Recommendations
Hi All, We have recently moved from Solr 6.5 to Solr Cloud 8.10.

*Earlier Architecture:* We were using a master-slave architecture where we had 4 slaves (14 CPU, 96 GB RAM, 20 GB heap, 110 GB index size). We used to optimize and replicate nightly.

*Now:* We didn't have a clear direction on the number of shards, so we did some POC with variable numbers of shards. We found that with 8 shards we were close to the response time we were getting earlier without using too much infrastructure. Based on our queries we couldn't find a routing parameter, so now all queries are broadcast to every shard. Now we have an 8+1 Solr node cluster, where 1 indexing node contains all (8) NRT primary shards. This is where all indexing happens. Then we have another 8 nodes, each having (10 CPU, 42 GB RAM, 8 GB heap, ~23 GB index), consisting of one PULL replica of each primary shard. For querying, we have set *shards.preference to PULL replicas* so that all queries are served from pull replicas. Our thought process was that we should keep the indexing layer and query layer separate so one does not affect the other. We made it live this week. It didn't help in reducing the average response time; in fact, we found an increase in average response time. We did find a substantial improvement beyond the 85th percentile, so timeouts reduced significantly.

*Now I have a few questions for all the guys who are using Solr Cloud, to help me understand and increase the stability of my cluster.*
1. Were we right to separate the indexing and query layers? Is it a good idea, or could something else have been done better? Right now it can affect our cluster stability: if the replica node is not available then queries will start going to the indexing node, which is very weak, and it could choke the whole cluster.
2. Is there any guideline for the number of shards and shard size?
3. How to decide the ideal number of CPUs to have per node? Is there any metric we can follow, like load or CPU usage? What should be the ideal CPU usage and load average based on the number of CPUs? Our response time increases exponentially with the traffic: 250 ms to 400 ms in peak hours. Peak hour traffic remains at 2000 requests per minute, CPU usage at 55% and load average at ~6 (10 CPU).
4. How to decide the number of nodes based on shards or any other metric? Should one increase nodes or CPUs on existing nodes?
5. How to handle dev and stage environments? Should we have other smaller clusters or any other approach?
6. Did your infrastructure requirement also increase compared to standalone when moving to the cloud? If yes, by how much?
7. How do you maintain versioning of config in ZooKeeper?
8. Any performance issue you faced or any other recommendation?
Re: Solr Cloud Architecture Recommendations
Thank you Eric for your reply. Please find my responses below.

> 1. Were we right to separate the indexing and query layers? Is it a > > good idea, or could something else have been done better? Right now it can > > affect our cluster stability: if the replica node is not available then queries > > will start going to the indexing node, which is very weak, and it could choke the whole cluster. > This is a good question.. since eventually your updates go from your > leaders to all your replicas, I would start with just making all nodes the > same size, and not try to have two layers. I think, as you learn more, > maybe you could come up with more exotic layouts, but for now, just have > every node the same size, and let every node do the work. The only reason > to pull out your leaders is if they somehow do EXTRA work on indexing that > other nodes wouldn't do….

Our application involves lots of daily updates to the data. We regularly update approx 40-50% (~50 million documents) and we index continuously. Our replicas are of PULL type, so our indexing nodes are actually doing a lot of work that the others are not. This was the reason we decided to separate those layers: to maximize the performance of the query layer.

> 2. Is there any guideline for the number of shards and shard size? > If your queries are taking a long time due to the volume of data to query > over, then shard. If you previously had a single leader/follower (the new > terms for master/slave) and didn't have performance issues, then I would > have 1 shard, with 1 leader and 4 replicas. That would most closely > replicate your previous setup.

We were actually facing performance issues. Though our response time was better than with the cloud setup, it lacked stability. We had to optimize daily, with only a single nightly replication of all the indexed data, so we sacrificed real-time searching for stability. Some queries took a lot of time to execute when they matched more results. Hence we decided to move to the cloud and thought of doing sharding. We tried many combinations of the number of shards and found that 20-25 GB shards gave us the response time closest to the earlier setup.

> 3. How to decide the ideal number of CPUs to have per node? Is there any > > metric we can follow, like load or CPU usage? > > What should be the ideal CPU usage and load average based on the number of > > CPUs? Our response time increases exponentially with the traffic: 250 ms > > to 400 ms in peak hours. Peak hour traffic remains at 2000 requests per > > minute, CPU usage at 55% and load average at ~6 (10 CPU). > Lots of great stuff around Grafana and other tooling to get data, but I > don't have a specific answer.

Yes, we do have detailed dashboards for all the metrics, but I was looking for some general guideline, e.g. if your load or CPU usage is above some number then you probably need to increase the core count, or some other defined rule of thumb. That would make it easier to fine-tune the system and convince the infra team to allocate more resources, because we have seen that when load/CPU usage is lower in the off-peak hours, response time is significantly lower.

> 5. How to handle dev and stage environments? Should we have other smaller > > clusters or any other approach? > On dev and stage, since having more replicas is to support volume of > queries, then I think you are okay with having just 1 leader and 1 > follower, or even just having the leader. Now, if you need to shard to > support your query cause it takes a long time, then you can do that.

So we do need to have different small clusters (master only, or fewer replicas). That will require either using our primary indexing process to index data on all clusters, which might make it a little slow, or a different indexing process to index everything for each cluster, which will lead to different data in each cluster and different results in calibration. I was thinking there might be some way we could sync the data from the primary cluster without hampering its performance (see the sketch at the end of this mail).

> 7. How do you maintain versioning of config in ZooKeeper? > I bootstrap my configset and use the api's. > https://github.com/querqy/chorus/blob/main/quickstart.sh#L119 < > https://github.com/querqy/chorus/blob/main/quickstart.sh#L119> for an > example.

Thank you, I will check this.

On Thu, Sep 8, 2022 at 11:43 PM Eric Pugh wrote: > Lots of good questions here, I'll inline a couple of answers... > > > On Sep 8, 2022, at 1:59 AM, Satya Nand > wrote: > > > > Hi All, > > > > We have recently moved from solr 6.5 to solr cloud 8.10. > > > > > > *Earlier Architecture:* We were using a master-slave architecture where we > > had 4 slaves (14 cpu, 96 GB ram, 20 GB Heap, 110 GB index size). We used > to > >
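Rough sketch of the sync idea mentioned above, using the collections backup/restore API from SolrJ. The ZooKeeper hosts, collection name, backup name and shared backup location are placeholders, and I have not verified this against our setup yet:

    import java.util.List;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class SyncStageFromBackup {
        public static void main(String[] args) throws Exception {
            // Take a backup of the production collection to a shared location...
            try (CloudSolrClient prod = new CloudSolrClient.Builder(
                    List.of("prod-zk:2181"), Optional.empty()).build()) {
                CollectionAdminRequest.backupCollection("im-search", "im-search-nightly")
                    .setLocation("/mnt/solr-backups")
                    .process(prod);
            }
            // ...and restore it into the dev/stage cluster as a new collection.
            try (CloudSolrClient stage = new CloudSolrClient.Builder(
                    List.of("stage-zk:2181"), Optional.empty()).build()) {
                CollectionAdminRequest.restoreCollection("im-search", "im-search-nightly")
                    .setLocation("/mnt/solr-backups")
                    .process(stage);
            }
        }
    }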
Re: Solr Cloud Architecture Recommendations
Hi Matthew, In my experience sharding really slows you down because of all the > extra network chatter. Yes, we have also faced the same. But it is not just about the cloud: we could never match the response time of our old Solr (6.5) with the upgraded one (8.7, 8.10), even without the cloud. 6.5 was always lower, probably due to how some graph queries were re-implemented in Solr 8.5. https://lists.apache.org/thread/kbjgztckqdody9859knq05swvx5xj20f But the cloud has helped us bring the response time down beyond the 85th percentile, so timeouts are reduced. Do you index continuously or nightly or what? You should never need > to optimize. Our application involves lots of daily updates to the data. We regularly update approx 40-50% (~50 million documents) and we index continuously (15 minutes commit interval). Earlier, with standalone Solr, we used to optimize to reduce response time. Check out your cache performance (in JMX or the solr ui) and increase > those if you index infrequently. Ideally your entire index should be > landing in memory. These are some cache stats from a randomly taken node from the cluster (8 GB heap size). Let me know if you find something very wrong. We took the same configuration from our standalone Solr (6.5).

*queryResultCache*
class: org.apache.solr.search.CaffeineCache
description: Caffeine Cache(maxSize=3, initialSize=1000, autowarmCount=100, regenerator=org.apache.solr.search.SolrIndexSearcher$3@477e8951)
stats:
CACHE.searcher.queryResultCache.lookups:18315
CACHE.searcher.queryResultCache.cumulative_lookups:14114139
CACHE.searcher.queryResultCache.ramBytesUsed:453880928
CACHE.searcher.queryResultCache.inserts:12747
CACHE.searcher.queryResultCache.warmupTime:11397
CACHE.searcher.queryResultCache.hitratio:0.3576303576303576
CACHE.searcher.queryResultCache.maxRamMB:-1
CACHE.searcher.queryResultCache.cumulative_inserts:9995188
CACHE.searcher.queryResultCache.evictions:0
CACHE.searcher.queryResultCache.cumulative_evictions:83119
CACHE.searcher.queryResultCache.size:11836
CACHE.searcher.queryResultCache.cumulative_hitratio:0.34904764647705394
CACHE.searcher.queryResultCache.cumulative_hits:4926507
CACHE.searcher.queryResultCache.hits:6550

*filterCache*
class: org.apache.solr.search.CaffeineCache
description: Caffeine Cache(maxSize=1000, initialSize=300, autowarmCount=100, regenerator=org.apache.solr.search.SolrIndexSearcher$2@4b97c627)
stats:
CACHE.searcher.filterCache.hits:254221
CACHE.searcher.filterCache.cumulative_evictions:18495260
CACHE.searcher.filterCache.size:1000
CACHE.searcher.filterCache.maxRamMB:-1
CACHE.searcher.filterCache.hitratio:0.8998527506601443
CACHE.searcher.filterCache.warmupTime:4231
CACHE.searcher.filterCache.evictions:27376
CACHE.searcher.filterCache.cumulative_hitratio:0.9034759627596836
CACHE.searcher.filterCache.lookups:282514
CACHE.searcher.filterCache.cumulative_hits:187752452
CACHE.searcher.filterCache.cumulative_inserts:20058521
CACHE.searcher.filterCache.ramBytesUsed:192294056
CACHE.searcher.filterCache.inserts:28293
CACHE.searcher.filterCache.cumulative_lookups:207811231

*documentCache*
class: org.apache.solr.search.CaffeineCache
description: Caffeine Cache(maxSize=25000, initialSize=512, autowarmCount=512, regenerator=null)
stats:
CACHE.searcher.documentCache.evictions:341795
CACHE.searcher.documentCache.hitratio:0.5356143571564221
CACHE.searcher.documentCache.ramBytesUsed:60603608
CACHE.searcher.documentCache.cumulative_hitratio:0.5356143571564221
CACHE.searcher.documentCache.lookups:789850
CACHE.searcher.documentCache.hits:423055
CACHE.searcher.documentCache.cumulative_hits:423055
CACHE.searcher.documentCache.cumulative_evictions:341795
CACHE.searcher.documentCache.maxRamMB:-1
CACHE.searcher.documentCache.cumulative_lookups:789850
CACHE.searcher.documentCache.size:25000
CACHE.searcher.documentCache.inserts:366795
CACHE.searcher.documentCache.warmupTime:0
CACHE.searcher.documentCache.cumulative_inserts:366795

On Fri, Sep 9, 2022 at 1:43 AM matthew sporleder wrote: > In my experience sharding really slows you down because of all the > extra network chatter. > > Do you index continuously or nightly or what? You should never need > to optimize. > > Check out your cache performance (in JMX or the solr ui) and increase > those if you index infrequently. Ideally your entire index should be > landing in memory. > > On Thu, Sep 8, 2022 at 1:59 AM Satya Nand > wrote: > > > > Hi All, > > > > We have recently moved from solr 6.5 to solr cloud 8.10. > > > > > > *Earlier Architecture:* We were using a master-slave architecture where we > > had 4 slaves (14 cpu, 96 GB ram, 20 GB Heap, 110 GB index size). We used > to > > optimize and replicate nightly. > > > > *Now.* > > We didn't have a clear direction on the number of shards. So we did some > > POC with variable numbers of shards. We found that with 8 shards we were > > close to the respo
Pull Interval in Pull Type replicas of Solr Cloud ?
Hi, Is there any configuration where we can define the replication interval, i.e., when a PULL replica should pull index files from the NRT replicas?
Re: Pull Interval in Pull Type replicas of Solr Cloud ?
Hi Markus, thank you. So in this case the commit interval effectively becomes the polling interval? Frequent commits => frequent replication? On Mon, Oct 3, 2022 at 3:31 PM Markus Jelsma wrote: > Hello Satya, > > There is no replication interval to define. The PULL or TLOG replicas will > pull new segment data from the current shard leader as they become > available. No specific configuration is needed. > > Regards, > Markus > > Op ma 3 okt. 2022 om 11:48 schreef Satya Nand > : > > > Hi, > > > > Is there any configuration where we can define the replication interval, > > i.e., when a PULL replica should pull index files from the NRT replicas? > > >
removing deleted documents in solr cloud (8.10)
Hi, This is what our Solr Cloud's merge policy looks like, and we have approx 30% deleted documents in the index.
- Is this normal?
- If not, how can I decrease the number of deleted documents?
- Will the above help us with response time?

200
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">5</int>
  <int name="segmentsPerTier">3</int>
</mergePolicyFactory>
Re: removing deleted documents in solr cloud (8.10)
Thanks, Markus, I will try it and will let you know. On Mon, Oct 3, 2022 at 6:05 PM Markus Jelsma wrote: > Hello Satya, > > This is what our Solr Cloud's merge policy looks like, and we have approx > > 30% deleted documents in the index. > > > > - Is this normal? > > > > It depends on how often you delete or overwrite existing documents, > although I find 30% to be a little too high for my comfort. Our various > Solr collections are all very different from each other; it ranges from 0.3% > to 9%, and 16% and even 24%. Very normal for the way they are used. > > > > - If not, how can I decrease the number of deleted documents? > > > > You can try setting deletesPctAllowed in the > TieredMergePolicyFactory [1], it defaults to 33%. I am not sure if it will > work though, so please report back if you can. > > > > - Will the above help us with response time? > > > > It depends, but possibly. A lower value means more merging and thus more IO > on the leader node. If it is separated from the follower node and you only > query that node, then you should have smaller indexes and so see better > response times. > > Regards, > Markus > > [1] > > https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/index/TieredMergePolicy.html#setDeletesPctAllowed-double- > > Op ma 3 okt. 2022 om 12:06 schreef Satya Nand > : > > > Hi, > > > > This is what our Solr Cloud's merge policy looks like, and we have approx > > 30% deleted documents in the index. > > > > - Is this normal? > > - If not, how can I decrease the number of deleted documents? > > - Will the above help us with response time? > > > > 200 > > <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory"> > >   <int name="maxMergeAtOnce">5</int> > >   <int name="segmentsPerTier">3</int> > > </mergePolicyFactory> > >
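As a stop-gap until we change the merge policy, I am also looking at an explicit commit with expungeDeletes, which asks Lucene to merge away segments with many deleted docs on demand; a rough SolrJ sketch (ZooKeeper host and collection name are placeholders), noting that this puts more IO load on the leader than letting the merge policy handle it:

    import java.util.List;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
    import org.apache.solr.client.solrj.request.UpdateRequest;

    public class ExpungeDeletesCommit {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    List.of("zk-host:2181"), Optional.empty()).build()) {
                UpdateRequest req = new UpdateRequest();
                // Issue a commit and ask Lucene to merge away segments with deletes.
                req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
                req.setParam("expungeDeletes", "true");
                req.process(client, "im-search"); // collection name is a placeholder
            }
        }
    }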
maximum searchers registered per replica in solr cloud.
Hi, We have an 8-node cluster, where each node contains only 1 replica of PULL type. For the last few hours we are suddenly facing high CPU usage and load on one of the servers (double that of the other servers). Upon checking in the Solr Grafana dashboard it was found that *Mapped Total Capacity (JVM Metrics -> Buffer size section)* for this particular node was approx double that of the other servers: 54 GB vs 28 GB. Further checking in *CORE (Plugin/Stats)* for this particular server, there were two searchers registered for this core, something like this:
*Searcher@56aea1d7[im-search-03-08-22_shard2_replica_p19] main*
*Searcher@6f3bd5b7[im-search-03-08-22_shard2_replica_p19] main*
I checked other servers as well for the count of searchers, and I only found one for every core, for example:
*Searcher@7a7e7ba3[im-search-03-08-22_shard1_replica_p17] main*
Could these two searchers on one core have resulted in the increase in load and CPU usage? And what could have been the reason for two searchers? My understanding was that there is only one searcher per core. If this is the issue, can we do some configuration so that it doesn't happen again?
Re: maximum searchers registered per replica in solr cloud.
Hi Shawn, I don't think it was heap size. We have assigned only 8 GB of heap. This was named "mapped total capacity" in the Grafana dashboard. The heap size section was also there, where I could see the heap usage based on 8 GB. But GC count and time for this server were high as well. I guess it is something related to the MMap directory implementation of the index directory, though I could be wrong. These 2 searchers have been there since 11 am, and after every commit two more searchers are being opened again. Right now these are the two searchers:
- Searcher@479c8248[im-search-03-08-22_shard2_replica_p19] main
- Searcher@6f3bd5b7[im-search-03-08-22_shard2_replica_p19] main
Sharing the cache configuration as well. Total documents in this replica are 18 million (25 mn maxdoc, 7 mn deleted docs):
<filterCache class="solr.CaffeineCache" size="1000" initialSize="300" autowarmCount="100" />
<queryResultCache class="solr.CaffeineCache" size="3" initialSize="1000" autowarmCount="100" />
<documentCache class="solr.CaffeineCache" size="25000" initialSize="512" autowarmCount="512" />
On Fri, 7 Oct 2022 at 6:32 PM, Shawn Heisey wrote: > On 10/7/22 06:23, Satya Nand wrote: > > Upon checking in the Solr Grafana dashboard it was found that *Mapped Total > > Capacity (JVM Metrics -> Buffer size section)* for this particular node was > > approx double that of the other servers: 54 GB vs 28 GB. > > > > Further checking in *CORE (Plugin/Stats)* for this particular server, there > > were two searchers registered for this core, something like this > > Usually when there are multiple searchers, it's because there is an > existing searcher handling queries and at least one new searcher that is > being warmed as a replacement. When the new searcher is fully warmed, > the existing searcher will shut down as soon as all queries that are > using it are complete. > > 28GB of heap memory being assigned to the searcher seems extremely > excessive. Can you share the cache configuration in solrconfig.xml and > the max doc count in the core? > > Thanks, > Shawn > >
Re: maximum searchers registered per replica in solr cloud.
Shawn, I just observed that the 2nd searcher is not getting closed. It was also there when I originally posted the question. On Fri, 7 Oct 2022 at 7:25 PM, Satya Nand wrote: > Hi Shawn, > > I don't think it was heap size. We have assigned only 8 GB of heap. This > was named "mapped total capacity" in the Grafana dashboard. The heap size > section was also there, where I could see the heap usage based on 8 GB. But > GC count and time for this server were high as well. > > I guess it is something related to the MMap directory implementation of the index > directory, though I could be wrong. > > These 2 searchers have been there since 11 am, and after every commit two more > searchers are being opened again. > > Right now these are the two searchers: > >- Searcher@479c8248[im-search-03-08-22_shard2_replica_p19] main >- Searcher@6f3bd5b7[im-search-03-08-22_shard2_replica_p19] main > > Sharing the cache configuration as well. Total documents in this > replica are 18 million (25 mn maxdoc, 7 mn deleted docs): > <filterCache class="solr.CaffeineCache" size="1000" initialSize="300" autowarmCount="100" /> > <queryResultCache class="solr.CaffeineCache" size="3" initialSize="1000" autowarmCount="100" /> > <documentCache class="solr.CaffeineCache" size="25000" initialSize="512" autowarmCount="512" /> > > On Fri, 7 Oct 2022 at 6:32 PM, Shawn Heisey wrote: > >> On 10/7/22 06:23, Satya Nand wrote: >> > Upon checking in the Solr Grafana dashboard it was found that *Mapped >> Total >> > Capacity (JVM Metrics -> Buffer size section)* for this particular node >> was >> > approx double that of the other servers: 54 GB vs 28 GB. >> > >> > Further checking in *CORE (Plugin/Stats)* for this particular server, >> there >> > were two searchers registered for this core, something like this >> >> Usually when there are multiple searchers, it's because there is an >> existing searcher handling queries and at least one new searcher that is >> being warmed as a replacement. When the new searcher is fully warmed, >> the existing searcher will shut down as soon as all queries that are >> using it are complete. >> >> 28GB of heap memory being assigned to the searcher seems extremely >> excessive. Can you share the cache configuration in solrconfig.xml and >> the max doc count in the core? >> >> Thanks, >> Shawn >> >>
Re: maximum searchers registered per replica in solr cloud.
Shawn, Yes, earlier I had not found out about the multiple searchers; I reached this after debugging a lot. And now I can confirm it was the same searcher as when I first posted the question. I reloaded the collection 15 minutes ago and now the stuck searcher is gone, so I won't be able to tell you the warmup time for the stuck searcher. But after reloading, the mapped total capacity has now reached 75 GB. On other servers it is still 25 GB. We do use our custom phonetic implementation, but it has been running for years. On Fri, 7 Oct 2022 at 7:39 PM, Shawn Heisey wrote: > On 10/7/22 07:55, Satya Nand wrote: > > I guess it is something related to the MMap directory implementation of the index > > directory, though I could be wrong. > > That makes sense. MMAP space is not a count of memory actually taken, > just the amount of data that is accessible. The OS manages how much of > that data is actually in memory. > > > These 2 searchers have been there since 11 am, and after every commit two more > > searchers are being opened again. > > I don't know what timezone you are in, but I am assuming that was quite > a while ago. Are you sure it's the exact same searcher that you saw at > 11 AM? What is the warmupTime for each searcher you see in Plugins/Stats? > > Just saw your additional message. Are you running with any modules or > custom code? Something that required loading extra jars? > > Thanks, > Shawn > >
Re: maximum searchers registered per replica in solr cloud.
Shawn, After some time the mapped total capacity reached 100 GB (probably after a commit). The young GC count doubled and the server threw an out of memory error. Any insights? What can we check to avoid this? On Fri, 7 Oct, 2022, 8:03 pm Satya Nand, wrote: > Shawn, > > Yes, earlier I had not found out about the multiple searchers; I reached > this after debugging a lot. > And now I can confirm it was the same searcher as when I first posted the > question. > > I reloaded the collection 15 minutes ago and now the stuck searcher is > gone, so I won't be able to tell you the warmup time for the stuck searcher. > > But after reloading, the mapped total capacity has now reached 75 GB. On > other servers it is still 25 GB. > > > We do use our custom phonetic implementation, but it has been running for > years. > > > > > On Fri, 7 Oct 2022 at 7:39 PM, Shawn Heisey wrote: > >> On 10/7/22 07:55, Satya Nand wrote: >> > I guess it is something related to the MMap directory implementation of >> the index >> > directory, though I could be wrong. >> >> That makes sense. MMAP space is not a count of memory actually taken, >> just the amount of data that is accessible. The OS manages how much of >> that data is actually in memory. >> >> > These 2 searchers have been there since 11 am, and after every commit two more >> > searchers are being opened again. >> >> I don't know what timezone you are in, but I am assuming that was quite >> a while ago. Are you sure it's the exact same searcher that you saw at >> 11 AM? What is the warmupTime for each searcher you see in Plugins/Stats? >> >> Just saw your additional message. Are you running with any modules or >> custom code? Something that required loading extra jars? >> >> Thanks, >> Shawn >> >>
Shards Parameter causing Infinite loop in solr cloud search
Hi, Good Morning. We have an 8+1 Solr node cluster, where 1 indexing node contains all (8) NRT primary shards. This is where all indexing happens. Then we have another 8 nodes consisting of one PULL replica of each primary shard. To limit queries to those replicas we have made the following changes in solrconfig and in the shard whitelisting:
true
10.128.74.11:6086/solr/im-search,10.128.74.11:6087/solr/im-search
But after the changes, the requests are going in an infinite loop. I found this in the documentation, but I couldn't understand what a standard vs non-standard request handler is: *"Do not add the shards parameter to the standard request handler; doing so > may cause search queries to enter an infinite loop. Instead, define a new > request handler that uses the shards parameter, and pass distributed search > requests to that handler."* So, how can we use the shards parameter in solrconfig to limit the shards? I think one alternative would be to pass the shards parameter in the URL instead of solrconfig (see the sketch below), but we would prefer to keep the changes limited to the config.
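For completeness, the URL-parameter alternative mentioned above would look roughly like this from SolrJ; a sketch only, with the replica URLs and collection name standing in for our full list:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class ExplicitShardsQuery {
        static QueryResponse run(CloudSolrClient client) throws Exception {
            SolrQuery q = new SolrQuery("*:*");
            // Explicit list of replica URLs to query, one entry per shard.
            q.set("shards",
                "10.128.74.11:6086/solr/im-search,10.128.74.11:6087/solr/im-search");
            // Return partial results if one of the listed replicas is down.
            q.set("shards.tolerant", "true");
            return client.query("im-search", q);
        }
    }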
Re: Shards Parameter causing Infinite loop in solr cloud search
Hi Shawn, > > The standard request handler is usually the one named "/select". You may > want to add a new handler for this purpose. We are already using a custom request handler; actually, there is no /select handler in our solrconfig. > Your message subject says you are in cloud mode. If that is true, I > think you are going to want to specify shards by name, not URL. Yes, we are using Solr Cloud. The reason I don't want to specify the shard names is that a request can then be sent to any replica of a shard based on preference and availability, but I specifically want to limit a request to a PULL-type replica of each shard. I am trying to replicate the behavior on this link: https://solr.apache.org/guide/8_4/distributed-requests.html#limiting-which-shards-are-queried This section: > Alternatively, you can specify a list of replicas you wish to use in place > of shard IDs by separating the replica IDs with commas: http://localhost:8983/solr/gettingstarted/select?q=*:*&shards=localhost:7574/solr/gettingstarted,localhost:8983/solr/gettingstarted But when I do this, my request goes in an infinite loop. Is there anything I can do to make it work? I just want to use a specific set of replicas with shards.tolerant=true. On Mon, Oct 10, 2022 at 5:07 PM Shawn Heisey wrote: > On 10/10/22 01:57, Satya Nand wrote: > > >> *"Do not add the shards parameter to the standard request handler; > doing so > >> may cause search queries to enter an infinite loop. Instead, define a > new > >> request handler that uses the shards parameter, and pass distributed > search > >> requests to that handler."* > > > > So, how can we use the shards parameter in solrconfig to limit the shards? > > I think one alternative would be to pass the shards parameter in the URL instead > > of solrconfig, but we would prefer to keep the changes limited to the config. > > The standard request handler is usually the one named "/select". You may > want to add a new handler for this purpose. > > Your message subject says you are in cloud mode. If that is true, I > think you are going to want to specify shards by name, not URL. If you > are in standalone mode (no zookeeper) then the way I handled that was to > build a special core with an empty index that had a predefined list of > shard URLs in the /select handler. When I did that, I was using the > "defaults" parameter config. I think if I did it again I would use > "invariants" so the user would not be able to override the list. > > Thanks, > Shawn > >
Re: Shards Parameter causing Infinite loop in solr cloud search
Shawn, Actually we were already using the preference parameter, but recently we faced an issue where 1 pull replica went down (due to a GCP machine restart) and requests started going to the NRT replica. The machine hosting the NRT replica is pretty weak. That's why I was experimenting with the shards parameter containing all the URLs of the pull replicas, so a request has no option to go anywhere else. I am also planning to use shards.tolerant so that in case one or more replicas are down, we can get the response from the remaining replicas. Based on the documentation link I posted, it says that we can use replica URLs in the shards parameter, but I am not able to make it work. On Mon, 10 Oct, 2022, 6:14 pm Shawn Heisey, wrote: > On 10/10/22 06:00, Satya Nand wrote: > > Yes, we are using Solr Cloud. The reason I don't want to specify the > > shard names is that a request can then be sent to any replica of a shard > based > > on preference and availability, but I specifically want to limit a request > > to a PULL-type replica of each shard. > > > > I am trying to replicate the behavior on this link, > > This is a perfect use case for the shards.preference parameter. Use > "shards.preference=replica.type:PULL" along with a list of shard names. > If there is at least one PULL replica available for a shard, it will be > used. It will only try other replica types as a last resort. > > > https://solr.apache.org/guide/8_4/distributed-requests.html#shards-preference-parameter > > Thanks, > Shawn > >
Re: Shards Parameter causing Infinite loop in solr cloud search
Thanks, Shawn, for sharing all the possibilities; we will try to evaluate all of these. On Mon, 10 Oct, 2022, 6:45 pm Shawn Heisey, wrote: > On 10/10/22 06:58, Satya Nand wrote: > > Actually we were already using the preference parameter, but recently we faced an > > issue where 1 pull replica went down (due to a GCP machine restart) and > > requests started going to the NRT replica. > > The machine hosting the NRT replica is pretty weak. > > > > That's why I was experimenting with the shards parameter containing all the > URLs > > of the pull replicas, so a request has no option to go anywhere else. > > I am also planning to use shards.tolerant so that in case one or more replicas > are > > down, we can get the response from the remaining replicas. > > Some choices: > > * Bump up the hardware hosting the NRT replicas so they can also handle > queries. > * Add another set of PULL replicas on separate hardware. > * Adjust your systems so that each one hosts a PULL replica for two > different shards. > * Rearrange things so that each system hosts an NRT replica for one > shard and a PULL replica for a different shard. > > Thanks, > Shawn > >
external File field with solr cloud
Hi, We are planning to use an external file field. How do we manage it in Solr Cloud? Do we need to place a copy of the file in every shard's data directory, or can Solr Cloud somehow manage it for us?
Requests taking hours on solr cloud
Hi, Greetings for the day. We are facing a strange problem in Solr Cloud where a few requests are taking hours to complete. Some requests return with a 0 status code and some with a 500 status code. The most recent request took more than 5 hours to complete with only a 9k results count. These queries create problems in closing old searchers: sometimes there are 3-4 searchers, where one is a new searcher and the others are just stuck because a few queries are taking hours. Finally, the application slows down horribly and the load increases. I have downloaded the stack trace of the affected node and tried to analyze it online, but I couldn't get many insights from it. Stack Trace: https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjIvMTIvOC9sb2dzLnR4dC0tMTAtNTUtMzA=&; JVM Settings: We are using Parallel GC; can this be causing such long pauses? -XX:+UseParallelGC -XX:-OmitStackTraceInFastThrow -Xms12g -Xmx12g -Xss256k What more can we check here to find the root cause and prevent this from happening again? Thanks in advance
Re: Requests taking hours on solr cloud
++ GC report https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjIvMTIvOC9zb2xyX2djLmxvZy0tMTEtMzktMzE=&channel=WEB On Thu, Dec 8, 2022 at 4:57 PM Satya Nand wrote: > Hi, > > Greetings for the day, > > We are facing a strange problem in Solr cloud where a few requests are > taking hours to complete. Some requests return with a 0 status code and > some with a 500 status code. The recent request took more than 5 hours to > complete with only a 9k results count. > > > These queries create problems in closing old searchers, Some times there > are 3-4 searchers where one is a new searcher and the others are just stuck > because a few queries are tracking hours. Finally, the application slows > down horribly, and the load increases. > > I have downloaded the stack trace of the affected node and tried to > analyze this stack trace online. but I couldn't get many insights from it. > . > > Stack Trace: > > > https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjIvMTIvOC9sb2dzLnR4dC0tMTAtNTUtMzA=&; > > JVM Settings: We are using Parallel GC, can this be causing this much log > pause? > > -XX:+UseParallelGC > -XX:-OmitStackTraceInFastThrow > -Xms12g > -Xmx12g > -Xss256k > > What more we can check here to find the root cause and prevent this from > happening again? > Thanks in advance > > > > > >
Re: Requests taking hours on solr cloud
Hi Mikhail, Thanks for the response. This instance mostly idling, at that time it was coordinating one request > and awaits shard's request to complete see The shard is waiting on itself. 10.128.193.11 is the private IP of the same node where I have taken this stack trace. in the below request, One node has a PULL replica and one node has an NRT replica. We have set the preference to PULL replicas. httpShardExecutor-7-thread-939362-processing-x:im-search-03-08-22_shard1_replica_p17 r:core_node18 http: 10.128.193.11:8985//solr//im-search-03-08-22_shard1_replica_p17//|http:10.128.99.14:8985//solr//im-search-03-08-22_shard1_replica_n1// n:10.128.193.11:8985_solr c:im-search-03-08-22 s:shard1 [http: 10.128.193.11:8985//solr//im-search-03-08-22_shard1_replica_p17//, http: 10.128.99.14:8985//solr//im-search-03-08-22_shard1_replica_n1//] I tried to track internal requests for this main request which took almost 5+ hours to execute with only 9k hits. and it had a 0 status code (successful). There were 12 requests with this RID. 8 requests got successful at 10:41, but 4 requests got successful at 16:22. I checked the response time of internal requests, and no requests had a response time greater than 100 ms. This means Solr was waiting on something before executing requests. what could be that? AFAIK ParallelGC > despite its name is quite old and not really performant. Earlier we were using java 8 and G1GC with default settings. Recently we decide to upgrade java to 15. After upgrading java to 15, the application wasn't performing well. even with fewer GC counts and less GC time system was on load in peak hours. We experimented with ZGC, but that also didn't help. we tried parallel GC, and the system was stable, with no sudden load peaks in peak hours. that's why we are continuing with parallel GC. On Thu, Dec 8, 2022 at 5:31 PM Mikhail Khludnev wrote: > Hi Satya. > This instance mostly idling, at that time it were coordinating one request > and awaits shard request to complete see > > https://fastthread.io/same-state-threads.jsp?state=non-daemon&dumpId=1#panel111 > > > https://fastthread.io/same-state-threads.jsp?state=non-daemon&dumpId=1#panel118 > that another instance might have some clues in stacktrace. Also, if you > have 500 errors there might be exceptions; slow query logging might be > enabled and can give more clues for troubleshooting. AFAIK ParallelGC > despite its name is quite old and not really performant. > > On Thu, Dec 8, 2022 at 2:28 PM Satya Nand .invalid> > wrote: > > > Hi, > > > > Greetings for the day, > > > > We are facing a strange problem in Solr cloud where a few requests are > > taking hours to complete. Some requests return with a 0 status code and > > some with a 500 status code. The recent request took more than 5 hours to > > complete with only a 9k results count. > > > > > > These queries create problems in closing old searchers, Some times there > > are 3-4 searchers where one is a new searcher and the others are just > stuck > > because a few queries are tracking hours. Finally, the application slows > > down horribly, and the load increases. > > > > I have downloaded the stack trace of the affected node and tried to > analyze > > this stack trace online. but I couldn't get many insights from it. > > . > > > > Stack Trace: > > > > > > > https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjIvMTIvOC9sb2dzLnR4dC0tMTAtNTUtMzA=&; > > > > JVM Settings: We are using Parallel GC, can this be causing this much log > > pause? 
> > > > -XX:+UseParallelGC > > -XX:-OmitStackTraceInFastThrow > > -Xms12g > > -Xmx12g > > -Xss256k > > > > What more we can check here to find the root cause and prevent this from > > happening again? > > Thanks in advance > > > > > -- > Sincerely yours > Mikhail Khludnev >
Re: Requests taking hours on solr cloud
Hi Ere, We tried executing this request again and it didn't take any time. So it is not repeatable. average response time of all the queries around this period was only approx 100-200 ms. This was a group=true request where we get 14 groups and 5 results per group. So no deep pagination. On Fri, Dec 9, 2022 at 2:04 PM Ere Maijala wrote: > Hi, > > Are the same requests sometimes stalling and sometimes fast, or is it > some particular queries that take hours? > > There are some things you should avoid with SolrCloud, and deep paging > (i.e. a large number for the start or rows parameter) is a typical issue > (see e.g. https://yonik.com/solr/paging-and-deep-paging/ for more > information). > > Best, > Ere > > Satya Nand kirjoitti 8.12.2022 klo 13.27: > > Hi, > > > > Greetings for the day, > > > > We are facing a strange problem in Solr cloud where a few requests are > > taking hours to complete. Some requests return with a 0 status code and > > some with a 500 status code. The recent request took more than 5 hours to > > complete with only a 9k results count. > > > > > > These queries create problems in closing old searchers, Some times there > > are 3-4 searchers where one is a new searcher and the others are just > stuck > > because a few queries are tracking hours. Finally, the application slows > > down horribly, and the load increases. > > > > I have downloaded the stack trace of the affected node and tried to > analyze > > this stack trace online. but I couldn't get many insights from it. > > . > > > > Stack Trace: > > > > > https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjIvMTIvOC9sb2dzLnR4dC0tMTAtNTUtMzA=&; > > > > JVM Settings: We are using Parallel GC, can this be causing this much log > > pause? > > > > -XX:+UseParallelGC > > -XX:-OmitStackTraceInFastThrow > > -Xms12g > > -Xmx12g > > -Xss256k > > > > What more we can check here to find the root cause and prevent this from > > happening again? > > Thanks in advance > > > > -- > Ere Maijala > Kansalliskirjasto / The National Library of Finland >
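For context, the shape of the grouped request is roughly the following; just a sketch, and the query string and grouping field are placeholders, not our real ones:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.common.params.GroupParams;

    public class GroupedQuerySketch {
        static SolrQuery build() {
            SolrQuery q = new SolrQuery("some user query");  // placeholder query
            q.set(GroupParams.GROUP, "true");
            q.set(GroupParams.GROUP_FIELD, "group_field");   // placeholder field
            q.set(GroupParams.GROUP_LIMIT, 5);               // 5 results per group
            q.setRows(14);                                   // 14 groups per page
            q.setStart(0);                                   // no deep paging
            return q;
        }
    }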
Re: Requests taking hours on solr cloud
Pinging on this thread again to bring it to the top. Any idea why one request is stuck for hours in solr cloud.? On Fri, Dec 9, 2022 at 3:35 PM Satya Nand wrote: > Hi Ere, > > We tried executing this request again and it didn't take any time. So it > is not repeatable. average response time of all the queries around this > period was only approx 100-200 ms. > > This was a group=true request where we get 14 groups and 5 results per > group. So no deep pagination. > > On Fri, Dec 9, 2022 at 2:04 PM Ere Maijala > wrote: > >> Hi, >> >> Are the same requests sometimes stalling and sometimes fast, or is it >> some particular queries that take hours? >> >> There are some things you should avoid with SolrCloud, and deep paging >> (i.e. a large number for the start or rows parameter) is a typical issue >> (see e.g. https://yonik.com/solr/paging-and-deep-paging/ for more >> information). >> >> Best, >> Ere >> >> Satya Nand kirjoitti 8.12.2022 klo 13.27: >> > Hi, >> > >> > Greetings for the day, >> > >> > We are facing a strange problem in Solr cloud where a few requests are >> > taking hours to complete. Some requests return with a 0 status code and >> > some with a 500 status code. The recent request took more than 5 hours >> to >> > complete with only a 9k results count. >> > >> > >> > These queries create problems in closing old searchers, Some times >> there >> > are 3-4 searchers where one is a new searcher and the others are just >> stuck >> > because a few queries are tracking hours. Finally, the application slows >> > down horribly, and the load increases. >> > >> > I have downloaded the stack trace of the affected node and tried to >> analyze >> > this stack trace online. but I couldn't get many insights from it. >> > . >> > >> > Stack Trace: >> > >> > >> https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjIvMTIvOC9sb2dzLnR4dC0tMTAtNTUtMzA=&; >> > >> > JVM Settings: We are using Parallel GC, can this be causing this much >> log >> > pause? >> > >> > -XX:+UseParallelGC >> > -XX:-OmitStackTraceInFastThrow >> > -Xms12g >> > -Xmx12g >> > -Xss256k >> > >> > What more we can check here to find the root cause and prevent this from >> > happening again? >> > Thanks in advance >> > >> >> -- >> Ere Maijala >> Kansalliskirjasto / The National Library of Finland >> >
Re: Grouping fails with "null" as the output value, while filter is working
Hi, mailing list strips the attachments, you probably need to provide links for screenshots. On Thu, Dec 22, 2022 at 2:17 PM NEERAJ VERMA wrote: > Solr's Grouping feature is not working for one of our datasets. It yields > the grouping field's value as *null* at times, randomly (we haven't > established a pattern so far), and puts documents in a single bucket while > the field's value is different for each doc. Filtering on the actual value > of that field is working fine, though. > > Our schema on this particular field is the same for the last three months, > and data ingested just last week is impacted by this issue. > > In the first screenshot, you can see the group value as null while the > field has a value. > The next picture contains the same field applied as a filter, and yielding > the right number of matches against it but the grouping is still failing. > > > > - Neeraj >
Re: Requests taking hours on solr cloud
Hi Dominique, I looked at the stack trace but I couldn't know for sure why the thread is waiting. can anyone help me in decoding this? httpShardExecutor-7-thread-939362-processing-x:im-search-03-08-22_shard1_replica_p17 r:core_node18 http: 10.128.193.11:8985//solr//im-search-03-08-22_shard1_replica_p17//|http:10.128.99.14:8985//solr//im-search-03-08-22_shard1_replica_n1// n:10.128.193.11:8985_solr c:im-search-03-08-22 s:shard1 [http: 10.128.193.11:8985//solr//im-search-03-08-22_shard1_replica_p17//, http: 10.128.99.14:8985//solr//im-search-03-08-22_shard1_replica_n1//] PRIORITY : 5 THREAD ID : 0X7FE6180494C0 NATIVE ID : 0X54E3 NATIVE ID (DECIMAL) : 21731 STATE : WAITING stackTrace: java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(java.base@15.0.2/Native Method) - waiting on at java.lang.Object.wait(java.base@15.0.2/Object.java:321) at org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read( InputStreamResponseListener.java:318) - locked <0x00054cd27c88> (a org.eclipse.jetty.client.util.InputStreamResponseListener) at org.apache.solr.common.util.FastInputStream.readWrappedStream( FastInputStream.java:90) at org.apache.solr.common.util.FastInputStream.refill( FastInputStream.java:99) at org.apache.solr.common.util.FastInputStream.readByte( FastInputStream.java:217) at org.apache.solr.common.util.JavaBinCodec._init(JavaBinCodec.java:211) at org.apache.solr.common.util.JavaBinCodec.initRead(JavaBinCodec.java:202) at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:195) at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse( BinaryResponseParser.java:51) at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse( Http2SolrClient.java:711) at org.apache.solr.client.solrj.impl.Http2SolrClient.request( Http2SolrClient.java:421) at org.apache.solr.client.solrj.impl.Http2SolrClient.request( Http2SolrClient.java:776) at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest( LBSolrClient.java:369) at org.apache.solr.client.solrj.impl.LBSolrClient.request( LBSolrClient.java:297) at org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest( HttpShardHandlerFactory.java:371) at org.apache.solr.handler.component.ShardRequestor.call( ShardRequestor.java:132) at org.apache.solr.handler.component.ShardRequestor.call( ShardRequestor.java:41) at java.util.concurrent.FutureTask.run(java.base@15.0.2/FutureTask.java:264) at java.util.concurrent.Executors$RunnableAdapter.call(java.base@ 15.0.2/Executors.java:515) at java.util.concurrent.FutureTask.run(java.base@15.0.2/FutureTask.java:264) at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run( InstrumentedExecutorService.java:180) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0( ExecutorUtil.java:218) at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$269/0x0008010566b0.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@ 15.0.2/ThreadPoolExecutor.java:1130) at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@ 15.0.2/ThreadPoolExecutor.java:630) at java.lang.Thread.run(java.base@15.0.2/Thread.java:832) On Sun, Dec 18, 2022 at 3:46 PM Dominique Bejean wrote: > Hi, > > May be a thread dump and a heap dump can help to find where and why this > request is blocked ? > May be just by finding this thread in the Solr console, you can see where > the thread is blocked ? > > Regards > > Dominique > > > Le dim. 18 déc. 
2022 à 09:10, Satya Nand .invalid> > a écrit : > > > Pinging on this thread again to bring it to the top. > > > > Any idea why one request is stuck for hours in solr cloud.? > > > > On Fri, Dec 9, 2022 at 3:35 PM Satya Nand > > wrote: > > > > > Hi Ere, > > > > > > We tried executing this request again and it didn't take any time. So > it > > > is not repeatable. average response time of all the queries around this > > > period was only approx 100-200 ms. > > > > > > This was a group=true request where we get 14 groups and 5 results per > > > group. So no deep pagination. > > > > > > On Fri, Dec 9, 2022 at 2:04 PM Ere Maijala > > > wrote: > > > > > >> Hi, > > >> > > >> Are the same requests sometimes stalling and sometimes fast, or is it > > >> some particular queries that take hours? > > >> > > >> There are some things you should avoid with SolrCloud, and deep paging > > >> (i.e. a large number for the start or rows parameter) is a typical > issue > > >> (see e.g. https://yonik.com/solr/paging-and-deep-paging/ for more > > >> information). > > >
Re: Requests taking hours on solr cloud
Hi, One thing I have noticed is that if I keep these servers ideal (move request to another infra) then the searcher gets closed after a few minutes. so somehow incoming traffic is responsible for the searcher not getting closed. This particular request took almost 6 hours and only got closed when I diverted the traffic to another infra . https://drive.google.com/file/d/197QFkNNsbkhOL57lVn0EkPe6FEzKkWFL/view?usp=share_link On Thu, Dec 22, 2022 at 3:19 PM Satya Nand wrote: > Hi Dominique, > > I looked at the stack trace but I couldn't know for sure why the thread is > waiting. can anyone help me in decoding this? > > httpShardExecutor-7-thread-939362-processing-x:im-search-03-08-22_shard1_replica_p17 > r:core_node18 http: > 10.128.193.11:8985//solr//im-search-03-08-22_shard1_replica_p17//|http:10.128.99.14:8985//solr//im-search-03-08-22_shard1_replica_n1// > <http://10.128.193.11:8985//solr//im-search-03-08-22_shard1_replica_p17//%7Chttp:10.128.99.14:8985//solr//im-search-03-08-22_shard1_replica_n1//> > n:10.128.193.11:8985_solr c:im-search-03-08-22 s:shard1 [http: > 10.128.193.11:8985//solr//im-search-03-08-22_shard1_replica_p17//, http:// > //10.128.99.14:8985//solr//im-search-03-08-22_shard1_replica_n1//] > > PRIORITY : 5 > > THREAD ID : 0X7FE6180494C0 > > NATIVE ID : 0X54E3 > > NATIVE ID (DECIMAL) : 21731 > > STATE : WAITING > > stackTrace: > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(java.base@15.0.2/Native Method) > - waiting on > at java.lang.Object.wait(java.base@15.0.2/Object.java:321) > at org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read( > InputStreamResponseListener.java:318) > - locked <0x00054cd27c88> (a > org.eclipse.jetty.client.util.InputStreamResponseListener) > at org.apache.solr.common.util.FastInputStream.readWrappedStream( > FastInputStream.java:90) > at org.apache.solr.common.util.FastInputStream.refill( > FastInputStream.java:99) > at org.apache.solr.common.util.FastInputStream.readByte( > FastInputStream.java:217) > at org.apache.solr.common.util.JavaBinCodec._init(JavaBinCodec.java:211) > at org.apache.solr.common.util.JavaBinCodec.initRead(JavaBinCodec.java:202 > ) > at org.apache.solr.common.util.JavaBinCodec.unmarshal( > JavaBinCodec.java:195) > at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse( > BinaryResponseParser.java:51) > at > org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse( > Http2SolrClient.java:711) > at org.apache.solr.client.solrj.impl.Http2SolrClient.request( > Http2SolrClient.java:421) > at org.apache.solr.client.solrj.impl.Http2SolrClient.request( > Http2SolrClient.java:776) > at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest( > LBSolrClient.java:369) > at org.apache.solr.client.solrj.impl.LBSolrClient.request( > LBSolrClient.java:297) > at > org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest( > HttpShardHandlerFactory.java:371) > at org.apache.solr.handler.component.ShardRequestor.call( > ShardRequestor.java:132) > at org.apache.solr.handler.component.ShardRequestor.call( > ShardRequestor.java:41) > at java.util.concurrent.FutureTask.run(java.base@ > 15.0.2/FutureTask.java:264) > at java.util.concurrent.Executors$RunnableAdapter.call(java.base@ > 15.0.2/Executors.java:515) > at java.util.concurrent.FutureTask.run(java.base@ > 15.0.2/FutureTask.java:264) > at > com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run( > InstrumentedExecutorService.java:180) > at > 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0( > ExecutorUtil.java:218) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$269/0x0008010566b0.run(Unknown > Source) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@ > 15.0.2/ThreadPoolExecutor.java:1130) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@ > 15.0.2/ThreadPoolExecutor.java:630) > at java.lang.Thread.run(java.base@15.0.2/Thread.java:832) > > On Sun, Dec 18, 2022 at 3:46 PM Dominique Bejean < > dominique.bej...@eolya.fr> wrote: > >> Hi, >> >> May be a thread dump and a heap dump can help to find where and why this >> request is blocked ? >> May be just by finding this thread in the Solr console, you can see where >> the thread is blocked ? >> >> Regards >> >> Dominique >> >> >> Le dim. 18 déc. 2022 à 09:10, Satya Nand > .invalid> >> a écrit : >> >> > Pinging on this thread again to bring it to the top. >> > >> > Any idea why one request is stuc
Re: Requests taking hours on solr cloud
Thanks, Shawn. I will check that too. On Fri, Dec 23, 2022 at 12:42 AM Shawn Heisey wrote: > On 12/22/22 02:49, Satya Nand wrote: > > I looked at the stack trace but I couldn't know for sure why the thread > is > > waiting. can anyone help me in decoding this? > > > > > - locked <0x00054cd27c88> (a > > Search the thread dump for 0x00054cd27c88 to see what thread is > holding that lock and what other threads are waiting for that lock. > > Thanks, > Shawn >
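In case it is useful to anyone else debugging the same thing, I am also capturing lock owners from the affected node programmatically instead of reading the raw dumps by hand; a small sketch over remote JMX, where the host and port are placeholders for our setup (Solr exposes remote JMX when started with ENABLE_REMOTE_JMX_OPTS=true):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class LockOwnerDump {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://solr-node:18983/jmxrmi");
            try (JMXConnector jmx = JMXConnectorFactory.connect(url)) {
                ThreadMXBean mx = ManagementFactory.newPlatformMXBeanProxy(
                    jmx.getMBeanServerConnection(),
                    ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
                // lockedMonitors/lockedSynchronizers = true so lock owners are included.
                for (ThreadInfo t : mx.dumpAllThreads(true, true)) {
                    System.out.printf("%s (%s) waiting on %s owned by %s%n",
                        t.getThreadName(), t.getThreadState(),
                        t.getLockName(), t.getLockOwnerName());
                }
            }
        }
    }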
Re: Solr 9.1 Admin page not opening
Hi, Try force reloading the page with Ctrl+F5. The same thing happened to us when we upgraded Solr. On Fri, Jan 13, 2023 at 9:36 AM Anuj Bhargava wrote: > Already added *SOLR_JETTY_HOST="0.0.0.0"* in solr.in.sh > But getting the following error now - > > HTTP ERROR 503 Service Unavailable > URI: /solr/ > STATUS: 503 > MESSAGE: Service Unavailable > SERVLET: - > The solr.in.sh contains - > > *SOLR_PID_DIR="/var/solr"
> SOLR_HOME="/var/solr/data"
> LOG4J_PROPS="/var/solr/log4j2.xml"
> SOLR_LOGS_DIR="/var/solr/logs"
> SOLR_PORT="8983"
> SOLR_TIMEZONE="UTC"
> SOLR_JETTY_HOST="0.0.0.0"*
> > On Fri, 13 Jan 2023 at 07:07, Shawn Heisey wrote: > > > On 1/12/23 05:48, Anuj Bhargava wrote: > > > Also ran the following- > > > > > > [root@76 etc]# sudo ss -tulpn | grep 8983 > > > tcp LISTEN 0 50 [::]:8983 [::]:* > > > users:(("java",pid=17280,fd=134)) > > > > > > I think [::]:8983 should be *:8983 > > > How to get it > > > > Jan already gave you this information. > > > > > > > https://solr.apache.org/guide/solr/latest/deployment-guide/taking-solr-to-production.html#security-considerations > > > > Thanks, > > Shawn > > >
UpdateHandler merges count
Hi, I wanted to check the merges happening in my index to relate them to CPU usage. I checked the metric UPDATE.updateHandler.merges.count on this path: ?type=update&entry=updateHandler, but it returns 0. Should I be looking somewhere else?
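For reference, this is roughly how I am reading the counter over the Metrics API from SolrJ, in case I am simply looking in the wrong registry; a small sketch, where the node URL is a placeholder for ours:

    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.GenericSolrRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.apache.solr.common.util.NamedList;

    public class MergeCountCheck {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient node = new HttpSolrClient.Builder(
                    "http://solr-node:8985/solr").build()) {
                ModifiableSolrParams p = new ModifiableSolrParams();
                p.set("group", "core");
                p.set("prefix", "UPDATE.updateHandler"); // includes the merges counter mentioned above
                NamedList<Object> rsp = node.request(
                    new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics", p));
                System.out.println(rsp);
            }
        }
    }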