I was gonna mention point 2 as well. Maybe you're hitting the -Xss JVM limit (256Kb) and need to manually increase it. I seem to recall that it should be at the very least 1Mb (or so it says here <https://github.com/openjdk/jdk11u/blob/master/src/hotspot/os_cpu/linux_x86/globals_linux_x86.hpp#L34> )
Maybe removing that parameter or setting it to 1024Kb may change your results On Wed, Oct 13, 2021 at 2:30 PM Deepak Goel <deic...@gmail.com> wrote: > There could be parameters which specify: > > 1. How long do old threads live? And when they can be reused. > > 2. You might also be reaching memory restrictions for creating new threads. > > 3. This is a bit offbeat thinking. But is there a lot of disk activities > happening as threads are nothing but files. > > On Wed, 13 Oct 2021, 22:30 Dominic Humphries, <domi...@adzuna.com.invalid> > wrote: > > > Sure - when solr gets pushed enough to reach the max thread count, it > > doesn't just re-use what it has, it keeps trying to create new ones. The > > logs *flood* with these errors: > > > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: > > [5032072.842s][warning][os,thread] Failed to start thread - > pthread_create > > failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached. > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: > > [5032072.843s][warning][os,thread] Failed to start thread - > pthread_create > > failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached. > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: > > [5032072.843s][warning][os,thread] Failed to start thread - > pthread_create > > failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached. > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: > > [5032072.848s][warning][os,thread] Failed to start thread - > pthread_create > > failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached. > > Oct 07 15:38:56 ip-10-40-23-148 solr[4734]: > > [5032072.852s][warning][os,thread] Failed to start thread - > pthread_create > > failed (EAGAIN) for attributes: stacksize: 256k, guardsize: 0k, detached. > > > > > > 2021-10-11 14:22:44.214 WARN (qtp178604517-3891) [ ] > > o.e.j.i.ManagedSelector => java.lang.OutOfMemoryError: unable to create > > native thread: possibly out of memory or process/resource limits reached > > at java.base/java.lang.Thread.start0(Native Method) > > java.lang.OutOfMemoryError: unable to create native thread: possibly out > of > > memory or process/resource limits reached > > at java.lang.Thread.start0(Native Method) ~[?:?] > > at java.lang.Thread.start(Thread.java:803) ~[?:?] > > at > > > > > org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:787) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:723) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io.SelectorManager.execute(SelectorManager.java:163) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io.ManagedSelector.execute(ManagedSelector.java:332) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io > > .ManagedSelector.destroyEndPoint(ManagedSelector.java:413) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:232) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io > .AbstractEndPoint.doOnClose(AbstractEndPoint.java:225) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io.ChannelEndPoint.updateKey(ChannelEndPoint.java:389) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io > > .ChannelEndPoint.updateKeyAction(ChannelEndPoint.java:357) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io > > > .ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:568) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > org.eclipse.jetty.io > > .ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539) > > ~[jetty-io-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at > > > > > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) > > ~[jetty-util-9.4.41.v20210516.jar:9.4.41.v20210516] > > at java.lang.Thread.run(Thread.java:834) [?:?] > > > > On Wed, 13 Oct 2021 at 17:24, Deepak Goel <deic...@gmail.com> wrote: > > > > > Am sorry, did you post the thread error.I might have missed it. Can you > > > please post it again? > > > > > > Deepak > > > "The greatness of a nation can be judged by the way its animals are > > treated > > > - Mahatma Gandhi" > > > > > > +91 73500 12833 > > > deic...@gmail.com > > > > > > Facebook: https://www.facebook.com/deicool > > > LinkedIn: www.linkedin.com/in/deicool > > > > > > "Plant a Tree, Go Green" > > > > > > Make In India : http://www.makeinindia.com/home > > > > > > > > > On Wed, Oct 13, 2021 at 9:15 PM Dominic Humphries > > > <domi...@adzuna.com.invalid> wrote: > > > > > > > Keeping threads around for re-use would certainly make sense to me, > but > > > > when the code starts erroring because it's trying to create ever-more > > new > > > > ones rather than re-use old ones, it seems there must be a bug > > > > > > > > On Wed, 13 Oct 2021 at 15:03, Deepak Goel <deic...@gmail.com> wrote: > > > > > > > > > We will have to check the solr logic. Threads waiting might get > > reused > > > in > > > > > the future, so might be good for all you know. > > > > > > > > > > Deepak > > > > > "The greatness of a nation can be judged by the way its animals are > > > > treated > > > > > - Mahatma Gandhi" > > > > > > > > > > +91 73500 12833 > > > > > deic...@gmail.com > > > > > > > > > > Facebook: https://www.facebook.com/deicool > > > > > LinkedIn: www.linkedin.com/in/deicool > > > > > > > > > > "Plant a Tree, Go Green" > > > > > > > > > > Make In India : http://www.makeinindia.com/home > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 6:05 PM Dominic Humphries > > > > > <domi...@adzuna.com.invalid> wrote: > > > > > > > > > > > Yep - I edited the two instances to remove those differences and > > > > re-ran a > > > > > > fairly tortuous testing cycle - 100 requests/sec. > > > > > > > > > > > > Interestingly, 8.9.0 outperformed 8.3.1 significantly at this > point > > > and > > > > > > neither hit the threads limit or failed to create new threads. > But > > > both > > > > > > scaled up to well above 2k threads and even after stopping the > > test, > > > > the > > > > > > threads remained in a timed wait condition. So I'm still thinking > > > > > there's a > > > > > > problem with threads not getting terminated somewhere. Test > output > > > > below > > > > > if > > > > > > it's of use: > > > > > > > > > > > > 8.3.1 > > > > > > upload: ./831_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:25 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 25.05 > > > > > > Duration [total, attack, wait] 1m29.97089157s, > > > 59.990075461s, > > > > > > 29.980816109s > > > > > > Latencies [mean, 50, 95, 99, max] 11.011900024s, > > 5.175300857s, > > > > > > 30.00091013s, 30.001077592s, 30.014343222s > > > > > > Bytes In [total, mean] 42385846, 7064.31 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 37.57% > > > > > > Status Codes [code:count] 0:3741 200:2254 500:5 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > Wed Oct 13 12:20:27 UTC 2021 > > > > > > upload: ./831_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:57 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 10.98 > > > > > > Duration [total, attack, wait] 1m29.931457331s, > > > > 59.990859187s, > > > > > > 29.940598144s > > > > > > Latencies [mean, 50, 95, 99, max] 7.74211412s, 57.37µs, > > > > > > 30.000885208s, 30.00105703s, 30.011268185s > > > > > > Bytes In [total, mean] 21287225, 3547.87 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 16.45% > > > > > > Status Codes [code:count] 0:5007 200:987 500:6 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > Wed Oct 13 12:21:58 UTC 2021 > > > > > > upload: ./831_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:27 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 11.68 > > > > > > Duration [total, attack, wait] 1m29.58141878s, > > > 59.990945585s, > > > > > > 29.590473195s > > > > > > Latencies [mean, 50, 95, 99, max] 9.05339281s, 58.742µs, > > > > > > 30.000880433s, 30.001052307s, 30.004624884s > > > > > > Bytes In [total, mean] 17434533, 2905.76 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 17.43% > > > > > > Status Codes [code:count] 0:4874 200:1046 500:80 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > context deadline exceeded (Client.Timeout or context cancellation > > > while > > > > > > reading body) > > > > > > Wed Oct 13 12:23:29 UTC 2021 > > > > > > upload: ./831_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:24:58 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 1.70 > > > > > > Duration [total, attack, wait] 1m29.541198374s, > > > > 59.990833778s, > > > > > > 29.550364596s > > > > > > Latencies [mean, 50, 95, 99, max] 8.155874514s, 51.073µs, > > > > > > 30.000892751s, 30.001051441s, 30.009856373s > > > > > > Bytes In [total, mean] 2211649, 368.61 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 2.53% > > > > > > Status Codes [code:count] 0:5810 200:152 500:38 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > Wed Oct 13 12:24:59 UTC 2021 > > > > > > upload: ./831_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:29 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 0.73 > > > > > > Duration [total, attack, wait] 1m29.451307081s, > > > > 59.990601937s, > > > > > > 29.460705144s > > > > > > Latencies [mean, 50, 95, 99, max] 7.257427504s, 50.866µs, > > > > > > 30.000885386s, 30.001023192s, 30.009992128s > > > > > > Bytes In [total, mean] 837808, 139.63 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 1.08% > > > > > > Status Codes [code:count] 0:5907 200:65 500:28 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > context deadline exceeded (Client.Timeout or context cancellation > > > while > > > > > > reading body) > > > > > > Wed Oct 13 12:26:30 UTC 2021 > > > > > > upload: ./831_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:00 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 1.65 > > > > > > Duration [total, attack, wait] 1m29.760671622s, > > > > 59.990482467s, > > > > > > 29.770189155s > > > > > > Latencies [mean, 50, 95, 99, max] 8.288506559s, 50.424µs, > > > > > > 30.000894904s, 30.00104114s, 30.016506845s > > > > > > Bytes In [total, mean] 1973103, 328.85 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 2.47% > > > > > > Status Codes [code:count] 0:5821 200:148 500:31 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > > > > > > > > > > > > > 8.9.0 > > > > > > upload: ./890_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:20:27 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 28.50 > > > > > > Duration [total, attack, wait] 1m29.930332927s, > > > > 59.990809535s, > > > > > > 29.939523392s > > > > > > Latencies [mean, 50, 95, 99, max] 10.958871049s, > > 5.184437078s, > > > > > > 30.000885573s, 30.001088926s, 30.00946393s > > > > > > Bytes In [total, mean] 51095634, 8515.94 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 42.72% > > > > > > Status Codes [code:count] 0:3432 200:2563 500:5 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > Wed Oct 13 12:20:28 UTC 2021 > > > > > > upload: ./890_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:21:58 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 38.15 > > > > > > Duration [total, attack, wait] 1m29.951119527s, > > > > 59.990310608s, > > > > > > 29.960808919s > > > > > > Latencies [mean, 50, 95, 99, max] 10.089597212s, > > 4.856237339s, > > > > > > 30.000863563s, 30.001063417s, 30.013413816s > > > > > > Bytes In [total, mean] 64455304, 10742.55 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 57.20% > > > > > > Status Codes [code:count] 0:2565 200:3432 500:3 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > Wed Oct 13 12:22:00 UTC 2021 > > > > > > upload: ./890_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:23:30 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 31.12 > > > > > > Duration [total, attack, wait] 1m29.971513653s, > > > > 59.990882146s, > > > > > > 29.980631507s > > > > > > Latencies [mean, 50, 95, 99, max] 7.982882635s, > > 4.086936886s, > > > > > > 30.000611417s, 30.001022921s, 30.00949772s > > > > > > Bytes In [total, mean] 49838902, 8306.48 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 46.67% > > > > > > Status Codes [code:count] 0:3011 200:2800 500:4 > > > > 503:185 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > unexpected EOF > > > > > > 503 Service Unavailable > > > > > > Wed Oct 13 12:23:31 UTC 2021 > > > > > > upload: ./890_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:25:01 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 30.21 > > > > > > Duration [total, attack, wait] 1m29.871026977s, > > > > 59.990066513s, > > > > > > 29.880960464s > > > > > > Latencies [mean, 50, 95, 99, max] 9.187948603s, > > 5.166973696s, > > > > > > 30.000723538s, 30.001043637s, 30.009522646s > > > > > > Bytes In [total, mean] 52508950, 8751.49 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 45.25% > > > > > > Status Codes [code:count] 0:3118 200:2715 500:4 > > > > 503:163 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > unexpected EOF > > > > > > 503 Service Unavailable > > > > > > Wed Oct 13 12:25:02 UTC 2021 > > > > > > upload: ./890_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:26:33 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 28.33 > > > > > > Duration [total, attack, wait] 1m29.931899376s, > > > > 59.990583341s, > > > > > > 29.941316035s > > > > > > Latencies [mean, 50, 95, 99, max] 10.643567102s, > > 4.009658048s, > > > > > > 30.000888301s, 30.00107338s, 30.010319307s > > > > > > Bytes In [total, mean] 51869384, 8644.90 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 42.47% > > > > > > Status Codes [code:count] 0:3448 200:2548 500:4 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > context deadline exceeded (Client.Timeout or context cancellation > > > while > > > > > > reading body) > > > > > > Wed Oct 13 12:26:34 UTC 2021 > > > > > > upload: ./890_tests_results.bin to > > > > > > > > s3://adzuna-files-stage/test_framework/solr/tests/2021-10-13:12:28:03 > > > > > > Requests [total, rate, throughput] 6000, 100.02, 40.46 > > > > > > Duration [total, attack, wait] 1m29.540373624s, > > > > 59.990568262s, > > > > > > 29.549805362s > > > > > > Latencies [mean, 50, 95, 99, max] 9.938485443s, > > 5.811191827s, > > > > > > 30.000761953s, 30.001041833s, 30.014534484s > > > > > > Bytes In [total, mean] 64483027, 10747.17 > > > > > > Bytes Out [total, mean] 0, 0.00 > > > > > > Success [ratio] 60.38% > > > > > > Status Codes [code:count] 0:2372 200:3623 500:5 > > > > > > Error Set: > > > > > > 500 Server Error > > > > > > > > > > > > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 13:06, Deepak Goel <deic...@gmail.com> > > wrote: > > > > > > > > > > > > > Hello > > > > > > > > > > > > > > I can as of now see two changes: > > > > > > > > > > > > > > 1. -Xmx > > > > > > > > > > > > > > 2. +ExplicitGCInvokesConcurrent > > > > > > > > > > > > > > Deepak > > > > > > > "The greatness of a nation can be judged by the way its animals > > are > > > > > > treated > > > > > > > - Mahatma Gandhi" > > > > > > > > > > > > > > +91 73500 12833 > > > > > > > deic...@gmail.com > > > > > > > > > > > > > > Facebook: https://www.facebook.com/deicool > > > > > > > LinkedIn: www.linkedin.com/in/deicool > > > > > > > > > > > > > > "Plant a Tree, Go Green" > > > > > > > > > > > > > > Make In India : http://www.makeinindia.com/home > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 5:09 PM Dominic Humphries > > > > > > > <domi...@adzuna.com.invalid> wrote: > > > > > > > > > > > > > > > CLI invocation for 8.3.1 is > > > > > > > > java -server -Xmx15826m -XX:+UseG1GC > -XX:+PerfDisableSharedMem > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 > > > > > -XX:+UseLargePages > > > > > > > > -XX:+AlwaysPreTouch > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M > > > > > > > > -Dcom.sun.management.jmxremote > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false > > > > > > > > -Dcom.sun.management.jmxremote.port=18983 > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 > > > > > > > -Dsolr.log.dir=/srv/solr/logs > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks > > > > > > > -Duser.timezone=UTC > > > > > > > > -Djetty.home=/usr/local/solr/server > > > -Dsolr.solr.home=/srv/solr/data > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf > > > > > > > > -Dlog4j.configurationFile=file:/srv/solr/log4j2.xml > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k > > > > > > -Dsolr.jetty.https.port=8983 > > > > > > > > -jar start.jar --module=http > > > > > > > > I believe the key items are: > > > > > > > > -XX:+AlwaysPreTouch > > > > > > > > -XX:+ParallelRefProcEnabled > > > > > > > > -XX:+PerfDisableSharedMem > > > > > > > > -XX:+UseG1GC > > > > > > > > -XX:+UseLargePages > > > > > > > > -XX:MaxGCPauseMillis=250 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M > > > > > > > > -Xmx15826m > > > > > > > > -Xss256k > > > > > > > > > > > > > > > > And for 8.9.0 is > > > > > > > > java -server -Xmx7913m -XX:+UseG1GC -XX:+PerfDisableSharedMem > > > > > > > > -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 > > > > > -XX:+UseLargePages > > > > > > > > -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M > > > > > > > > -Dsolr.jetty.inetaccess.includes= > > > -Dsolr.jetty.inetaccess.excludes= > > > > > > > > -Dcom.sun.management.jmxremote > > > > > > > > -Dcom.sun.management.jmxremote.local.only=false > > > > > > > > -Dcom.sun.management.jmxremote.ssl=false > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false > > > > > > > > -Dcom.sun.management.jmxremote.port=18983 > > > > > > > > -Dcom.sun.management.jmxremote.rmi.port=18983 > > > > > > > -Dsolr.log.dir=/srv/solr/logs > > > > > > > > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks > > > > > > > -Duser.timezone=UTC > > > > > > > > -XX:-OmitStackTraceInFastThrow > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 > > > > > > > /srv/solr/logs > > > > > > > > -Djetty.home=/usr/local/solr/server > > > -Dsolr.solr.home=/srv/solr/data > > > > > > > > -Dsolr.data.home= -Dsolr.install.dir=/usr/local/solr > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Dsolr.default.confdir=/usr/local/solr/server/solr/configsets/_default/conf > > > > > > > > -Dlog4j.configurationFile=/srv/solr/log4j2.xml > > > > > > > > -Dsolr.disable.shardsWhitelist=true -Xss256k -jar start.jar > > > > > > --module=http > > > > > > > > Key: > > > > > > > > -XX:+AlwaysPreTouch > > > > > > > > -XX:+ExplicitGCInvokesConcurrent > > > > > > > > -XX:+ParallelRefProcEnabled > > > > > > > > -XX:+PerfDisableSharedMem > > > > > > > > -XX:+UseG1GC > > > > > > > > -XX:+UseLargePages > > > > > > > > -XX:-OmitStackTraceInFastThrow > > > > > > > > -XX:MaxGCPauseMillis=250 > > > > > > > > -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 > > > > > > > /srv/solr/logs > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Xlog:gc*:file=/srv/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M > > > > > > > > -Xmx7913m > > > > > > > > -Xss256k > > > > > > > > > > > > > > > > Xmx values are based on the instance RAM, currently they're > > > running > > > > > on > > > > > > > two > > > > > > > > different instance types but we see the same behaviour when > > > they're > > > > > on > > > > > > > > identical types too. > > > > > > > > > > > > > > > > Many thanks > > > > > > > > > > > > > > > > Dominic > > > > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 12:07, Deepak Goel <deic...@gmail.com > > > > > > wrote: > > > > > > > > > > > > > > > > > Hello > > > > > > > > > > > > > > > > > > Can you please tell us the JVM Heap Setting for both the > > > > versions: > > > > > > > 8.3.1, > > > > > > > > > 8.9.0? > > > > > > > > > > > > > > > > > > I will also have to look into the following code: > > > > > > > > FileFloatSource.java:210. > > > > > > > > > (will do it tonite-IST and update) > > > > > > > > > > > > > > > > > > Deepak > > > > > > > > > "The greatness of a nation can be judged by the way its > > animals > > > > are > > > > > > > > treated > > > > > > > > > - Mahatma Gandhi" > > > > > > > > > > > > > > > > > > +91 73500 12833 > > > > > > > > > deic...@gmail.com > > > > > > > > > > > > > > > > > > Facebook: https://www.facebook.com/deicool > > > > > > > > > LinkedIn: www.linkedin.com/in/deicool > > > > > > > > > > > > > > > > > > "Plant a Tree, Go Green" > > > > > > > > > > > > > > > > > > Make In India : http://www.makeinindia.com/home > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 4:06 PM Dominic Humphries > > > > > > > > > <domi...@adzuna.com.invalid> wrote: > > > > > > > > > > > > > > > > > > > Oh, that's very helpful to know about, ty > > > > > > > > > > > > > > > > > > > > The overwhelming majority appear to be threads in > > > > TIMED_WAITING, > > > > > > all > > > > > > > > > > waiting on the same > > > > > > > > > > thing: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3b315cbb > > > > > > > > > > > > > > > > > > > > I've attached a screenshot which includes the stack > trace. > > > > > Stopping > > > > > > > all > > > > > > > > > > queries to the instance and waiting didn't result in any > > > > > noticeable > > > > > > > > > > decrease in the number of threads so it looks like > despite > > > > being > > > > > > > timed, > > > > > > > > > > they're simply not getting terminated. > > > > > > > > > > > > > > > > > > > > Restarting the service takes me back down to just 53 > > threads; > > > > > > > > re-running > > > > > > > > > a > > > > > > > > > > test results in many new threads immediately coming into > > > being, > > > > > > this > > > > > > > > time > > > > > > > > > > with a higher proportion of threads BLOCKED on > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de > > > > > > > > > > - See second screenshot. The stack trace for those is too > > big > > > > for > > > > > > one > > > > > > > > > > screen so here's the output: > > > > > > > > > > > > > > > > > > > > qtp178604517-861 (861) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$CreationPlaceholder@37b782de > > > > > > > > > > > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource$Cache.get(FileFloatSource.java:210) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource.getCachedFloats(FileFloatSource.java:158) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.function.FileFloatSource.getValues(FileFloatSource.java:97) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.getValues(ValueSource.java:203) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.getValues(FunctionScoreQuery.java:261) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.lucene.queries.function.FunctionScoreQuery$FunctionScoreWeight.scorer(FunctionScoreQuery.java:224) > > > > > > > > > > - > > > > > > org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379) > > > > > > > > > > - > > > > > > > > > > > > > > > org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344) > > > > > > > > > > - > > > > org.apache.lucene.search.Weight.bulkScorer(Weight.java:182) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338) > > > > > > > > > > - > > > > > > > > > > > > > > > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656) > > > > > > > > > > - > > > > > > > > > > > > > > > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:211) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1705) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1408) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1500) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:369) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216) > > > > > > > > > > - > > > org.apache.solr.core.SolrCore.execute(SolrCore.java:2637) > > > > > > > > > > - > > > > > > > > > > > org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:794) > > > > > > > > > > - > > > > > > org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:567) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > > > > > > > > > > - > > org.eclipse.jetty.server.Server.handle(Server.java:516) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.HttpChannel$$Lambda$556/0x000000080067a440.dispatch(Unknown > > > > > > > > > > Source) > > > > > > > > > > - > > > > > > > > > > > org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633) > > > > > > > > > > - > > > > > > > > org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) > > > > > > > > > > - > > > > > > > > > > org.eclipse.jetty.io > > > > > > > > > > > > > > > > > > .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) > > > > > > > > > > - org.eclipse.jetty.io > > > > > > > .FillInterest.fillable(FillInterest.java:105) > > > > > > > > > > - org.eclipse.jetty.io > > > > > > > > > .ChannelEndPoint$1.run(ChannelEndPoint.java:104) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) > > > > > > > > > > - > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) > > > > > > > > > > - java.base@11.0.5 > > /java.lang.Thread.run(Thread.java:834) > > > > > > > > > > > > > > > > > > > > [image: image.png] > > > > > > > > > > [image: image.png] > > > > > > > > > > > > > > > > > > > > On Wed, 13 Oct 2021 at 00:03, Joel Bernstein < > > > > joels...@gmail.com > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > >> There is a thread dump on the Solr admin. You can use > that > > > to > > > > > > > > determine > > > > > > > > > >> what all those threads are doing and where they are > > getting > > > > > stuck. > > > > > > > You > > > > > > > > > can > > > > > > > > > >> post parts of the thread dump back to this email thread > as > > > > well. > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> Joel Bernstein > > > > > > > > > >> http://joelsolr.blogspot.com/ > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> On Tue, Oct 12, 2021 at 11:15 AM Dominic Humphries > > > > > > > > > >> <domi...@adzuna.com.invalid> wrote: > > > > > > > > > >> > > > > > > > > > >> > We run 8.3.1 in prod without any problems, but we're > > > having > > > > > > issues > > > > > > > > > with > > > > > > > > > >> > trying to upgrade. > > > > > > > > > >> > > > > > > > > > > >> > I've created an 8.9.0 leader & follower, imported our > > live > > > > > data > > > > > > > into > > > > > > > > > it, > > > > > > > > > >> > and am testing it via replaying requests made to prod. > > > We're > > > > > > > seeing > > > > > > > > a > > > > > > > > > >> big > > > > > > > > > >> > problem where fairly moderate request rates are > causing > > > the > > > > > > > instance > > > > > > > > > to > > > > > > > > > >> > become so slow it fails healthcheck. The logs showed a > > lot > > > > of > > > > > > > errors > > > > > > > > > >> around > > > > > > > > > >> > creating threads: > > > > > > > > > >> > > > > > > > > > > >> > solr[4507]: [124136.511s][warning][os,thread] Failed > to > > > > start > > > > > > > > thread - > > > > > > > > > >> > pthread_create failed (EAGAIN) for attributes: > > stacksize: > > > > > 256k, > > > > > > > > > >> guardsize: > > > > > > > > > >> > 0k, detached. > > > > > > > > > >> > > > > > > > > > > >> > WARN (qtp178604517-3891) [ ] > o.e.j.i.ManagedSelector > > > => > > > > > > > > > >> > java.lang.OutOfMemoryError: unable to create native > > > thread: > > > > > > > possibly > > > > > > > > > >> out of > > > > > > > > > >> > memory or process/resource limits reached > > > > > > > > > >> > > > > > > > > > > >> > So I monitored thread count for the process whilst > > running > > > > the > > > > > > > test > > > > > > > > > >> suite > > > > > > > > > >> > and saw a persistent pattern: Threads increased until > > > maxed > > > > > out, > > > > > > > the > > > > > > > > > >> logs > > > > > > > > > >> > flooded with errors as it tried to create still more > > > > threads, > > > > > > and > > > > > > > > the > > > > > > > > > >> > instance slowed down until terminated as unhealthy. > > > > > > > > > >> > > > > > > > > > > >> > The DefaultTasksMax is set to 4915, I've tried raising > > and > > > > > > > lowering > > > > > > > > it > > > > > > > > > >> but > > > > > > > > > >> > regardless of value the result is the same: it gets > > maxed > > > > and > > > > > > > > > everything > > > > > > > > > >> > slows down. > > > > > > > > > >> > > > > > > > > > > >> > Is there anything I can do to stop solr spinning up so > > > many > > > > > > > threads > > > > > > > > it > > > > > > > > > >> > ceases to function? There have been a few test passes > > > where > > > > it > > > > > > > > > >> > spontaneously dropped threadcount from thousands to > > > hundreds > > > > > and > > > > > > > > > stayed > > > > > > > > > >> up > > > > > > > > > >> > longer, but there seems no pattern to when this > happens. > > > > > Running > > > > > > > the > > > > > > > > > >> same > > > > > > > > > >> > tests on 8.3.1 results in a much slower increase in > > > threads > > > > > and > > > > > > it > > > > > > > > > never > > > > > > > > > >> > quite maxes them so things continue to function. > > > > > > > > > >> > > > > > > > > > > >> > See below for the thread count and healthcheck times > > seen > > > > on a > > > > > > > > (fairly > > > > > > > > > >> > harsh) test run of 100 requests/sec > > > > > > > > > >> > > > > > > > > > > >> > Thanks > > > > > > > > > >> > > > > > > > > > > >> > Dominic > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > Threadcount: > > > > > > > > > >> > > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; ps > > -eLF | > > > > > grep > > > > > > > > > >> 'start.jar' > > > > > > > > > >> > | wc -l; sleep 10s; done > > > > > > > > > >> > Tue Oct 12 14:27:33 UTC 2021 > > > > > > > > > >> > 52 > > > > > > > > > >> > Tue Oct 12 14:27:43 UTC 2021 > > > > > > > > > >> > 52 > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021 > > > > > > > > > >> > 52 > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021 > > > > > > > > > >> > 52 > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021 > > > > > > > > > >> > 569 > > > > > > > > > >> > Tue Oct 12 14:28:24 UTC 2021 > > > > > > > > > >> > 899 > > > > > > > > > >> > Tue Oct 12 14:28:34 UTC 2021 > > > > > > > > > >> > 1198 > > > > > > > > > >> > Tue Oct 12 14:28:44 UTC 2021 > > > > > > > > > >> > 1589 > > > > > > > > > >> > Tue Oct 12 14:28:54 UTC 2021 > > > > > > > > > >> > 2016 > > > > > > > > > >> > Tue Oct 12 14:29:05 UTC 2021 > > > > > > > > > >> > 2451 > > > > > > > > > >> > Tue Oct 12 14:29:15 UTC 2021 > > > > > > > > > >> > 2851 > > > > > > > > > >> > Tue Oct 12 14:29:26 UTC 2021 > > > > > > > > > >> > 2934 > > > > > > > > > >> > Tue Oct 12 14:29:36 UTC 2021 > > > > > > > > > >> > 3249 > > > > > > > > > >> > Tue Oct 12 14:29:46 UTC 2021 > > > > > > > > > >> > 3501 > > > > > > > > > >> > Tue Oct 12 14:29:57 UTC 2021 > > > > > > > > > >> > 3734 > > > > > > > > > >> > Tue Oct 12 14:30:07 UTC 2021 > > > > > > > > > >> > 4128 > > > > > > > > > >> > Tue Oct 12 14:30:18 UTC 2021 > > > > > > > > > >> > 4374 > > > > > > > > > >> > Tue Oct 12 14:30:29 UTC 2021 > > > > > > > > > >> > 4637 > > > > > > > > > >> > Tue Oct 12 14:30:39 UTC 2021 > > > > > > > > > >> > 4693 > > > > > > > > > >> > Tue Oct 12 14:30:50 UTC 2021 > > > > > > > > > >> > 4807 > > > > > > > > > >> > Tue Oct 12 14:31:01 UTC 2021 > > > > > > > > > >> > 4916 > > > > > > > > > >> > Tue Oct 12 14:31:11 UTC 2021 > > > > > > > > > >> > 4916 > > > > > > > > > >> > Tue Oct 12 14:31:22 UTC 2021 > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host. > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > Healthcheck: > > > > > > > > > >> > > > > > > > > > > >> > ubuntu@ip-10-40-22-166:~$ while [ 1 ]; do date; curl > -v > > > > > > > > > >> > localhost:8983/solr/ 2>&1 | grep HTTP; date; echo > > '----'; > > > > > sleep > > > > > > > > > >> > 10s; done > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:27:34 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:27:44 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:27:54 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:28:04 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:28:14 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:02 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0< HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:28:16 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:28:26 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:12 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0< HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:28:39 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:28:49 UTC 2021 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:01 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0> GET /solr/ HTTP/1.1 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:23 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0< HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:29:13 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:29:23 UTC 2021 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:01 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0> GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:29:25 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:29:35 UTC 2021 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:03 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0> GET /solr/ HTTP/1.1 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:09 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0< HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:29:44 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:29:54 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:11 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0< HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:30:06 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:30:16 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:03 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0< HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:30:20 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:30:30 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > 0 0 0 0 0 0 0 0 --:--:-- > > > > > 0:00:02 > > > > > > > > > --:--:-- > > > > > > > > > >> > 0< HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:30:33 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:30:43 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:30:53 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > Tue Oct 12 14:30:55 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:31:05 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021 > > > > > > > > > >> > > GET /solr/ HTTP/1.1 > > > > > > > > > >> > < HTTP/1.1 200 OK > > > > > > > > > >> > Tue Oct 12 14:31:15 UTC 2021 > > > > > > > > > >> > ---- > > > > > > > > > >> > Connection to 10.40.22.166 closed by remote host. > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >