Are the response times the same on the two machines? Or is Solr 8 faster than Solr 6?
Deepak

On Fri, Aug 26, 2022 at 7:02 PM Shawn Heisey <apa...@elyograg.org.invalid> wrote:

> On 8/26/22 02:55, Sidharth Negi wrote:
> > We set up Solr 6 and Solr 8 on two identical AWS instances (16 cores,
> > 128 GB, of which Solr was given Xmx=50GB), indexed the same data on
> > them, and tested under the same load of traffic. The schema and
> > solrconfig.xml are exactly identical - the schema file is just renamed
> > as managed-schema in Solr 8. Neither of the two machines is indexing
> > data or taking replication, and both have about an equal number of
> > segments (42 and 45 segments for Solr 6 and Solr 8 respectively).
>
> Are you really sure that the heap needs to be that big? It really is
> huge, and due to the way that Java works, anything 32GB or larger
> requires 64-bit pointers. So a heap size of 31GB actually has more
> memory available than a heap size of 32GB. At 50 GB, you have likely
> passed the break-even point. But unless you're dealing with hundreds of
> millions of documents, it is very unlikely that you need a heap that big.
>
> > What's surprising is that Solr 6.6.1 CPU usage is considerably lower
> > than Solr 8.11.2. Just look at the screenshot attached. The blue line
> > is Solr 8.11.2 while the orange one is Solr 6.6.1. Note that the Solr
> > 8 CPU usage is considerably higher with identical traffic.
>
> You have higher CPU usage, but does Solr 8 actually perform worse than
> Solr 6? What do other metrics show, like CPU iowait percentage?
>
> You've talked about segment counts, but haven't talked about index
> size. Is the total disk space consumed by the index about the same on
> both?
>
> I can think of two differences between 6 and 8 that are fundamental:
>
> First: 6 uses CMS for garbage collection and 8 uses G1. G1 has better
> overall performance because more of its work can run in parallel
> with the application, and I can imagine that it uses a little more
> of resources such as memory and CPU.
>
> Second: 6 uses log4j 1 and 8 uses log4j 2. The newer logging library
> is much faster because it takes advantage of threads, which could
> increase the overall CPU usage. Whether that would cause a significant
> impact depends mostly on how busy the server is and whether the logging
> configuration has been changed. With default settings, at least one log
> message is created for almost every request that Solr receives.
>
> There have also been a lot of advancements in other areas, and those
> probably contribute. Higher CPU usage does not automatically mean that
> performance is worse. Sometimes applications actually perform better
> when using more CPU.
>
> Thanks,
> Shawn
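
For anyone following the heap-size point in the quoted message, the arithmetic behind the 32GB threshold can be sketched roughly as below. This is only an illustration, and it assumes HotSpot's default 8-byte object alignment. The function names are made up for this example and are not part of Solr or the JVM. The exact cutoff at which a real JVM stops using compressed pointers sits slightly below this bound and varies with JVM version and settings.

```python
GIB = 1024 ** 3  # bytes in one GiB


def compressed_oops_limit_bytes(object_alignment_bytes: int = 8) -> int:
    """Upper bound on the heap addressable with 32-bit compressed references.

    A 32-bit reference can name 2**32 distinct slots; with objects aligned
    to `object_alignment_bytes`, each slot is that many bytes apart.
    The default alignment of 8 is an assumption about the JVM in use.
    """
    return (2 ** 32) * object_alignment_bytes


def fits_compressed_oops(heap_gib: float, object_alignment_bytes: int = 8) -> bool:
    """True if a heap of `heap_gib` GiB stays below the theoretical bound."""
    return heap_gib * GIB < compressed_oops_limit_bytes(object_alignment_bytes)


if __name__ == "__main__":
    for heap_gib in (31, 32, 50):
        status = "32-bit references possible" if fits_compressed_oops(heap_gib) else "64-bit references needed"
        print(f"Xmx={heap_gib}g: {status}")
```

Under these assumptions the Xmx=50GB heap from the original post is well past the bound, so every object reference costs 8 bytes instead of 4, which is the overhead Shawn is describing.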