[ https://issues.apache.org/jira/browse/SOLR-16531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631183#comment-17631183 ]
Jason Gerlowski commented on SOLR-16531:
----------------------------------------

I can take a look at this sometime prior to 9.2. Overall I'd expect the introduction of JAX-RS to have at least some impact on core-load (which would affect restarts transitively). I wouldn't've expected it to be dramatic, but that shows what I know I guess.

I'm still a little unclear on what the performance test is actually doing here, and what the impact would be on an average cluster. I have some guesses that I'm not too sure of; I'll do a little "thinking-out-loud" below, and maybe [~ichattopadhyaya] or others can correct me where I err.

# So the screenshot and associated webpage show that the red line jumps from a little over 325-ish on the prior commit (3ceae7 - "Pin OS of docker image...") to a little over 350-ish with the JAX-RS commit. (Do you have access to the specific numbers there, Ishan?) So the delta of whatever this test is doing is ~25s.
# The performance test involves two tasks. According to the cluster-test.json file linked above, the first task involves collection creation (solely), and the second task is a restart of each node after all the collection creation is done.
# Again, going from cluster-test.json, it looks like task 1 creates 1000 collections, but doesn't specify how many shards or replicas each collection has. Does that mean 1s, 1r, or are there other defaults?
# Going from cluster-test.json, the cluster either has 8 or 7 nodes (not sure how to understand/reconcile the properties [here|https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json#L49] and [here|https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json#L35]).
# Assuming 8 nodes total going forward, each node would host roughly 1000 * replicasPerShard * shardsPerCollection / 8 cores. Or 125 * replicasPerShard * shardsPerCollection.
# Now, "task 2" itself involves restarting these loaded nodes 2 at a time and waiting for everything to be healthy between batches of restarts. If each node is restarted once, that means "task 2" would kick off restarts 4 times (again, doing 2 in parallel each time).
# So the "before" performance of ~325s translates to a restart of a node with 125 * replicasPerShard * shardsPerCollection total cores taking about 81s (325s / 4 batches).
# And the "after" performance of ~350s translates to a similar restart now taking about 87s (350s / 4 batches).
# So, ultimately, this perf test is telling us that JAX-RS makes restarts of heavily loaded nodes take ~7-8% longer, i.e. (87.5 - 81.25)/81.25 ≈ 7.7%.

How much of that did I get right vs wrong?

> Performance degradation due to introduction of JAX-RS
> -----------------------------------------------------
>
> Key: SOLR-16531
> URL: https://issues.apache.org/jira/browse/SOLR-16531
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Ishan Chattopadhyaya
> Priority: Blocker
> Fix For: 9.2
>
> Attachments: Screenshot from 2022-11-09 11-20-44.png
>
>
> During performance benchmarking on branch_9x, I observed a slowdown in
> restart performance since commits in SOLR-16347. See attached screenshot.
> CC [~gerlowskija].
> http://mostly.cool/cluster-test-with-patch.html
> The benchmark is here:
> https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json.
> This suite was run after retro-actively applying the parallelStream patch
> from SOLR-16414:
> https://github.com/apache/solr/commit/b33161d0cdd976fc0c3dc78c4afafceb4db671cf.diff
>
> Effort to automate these benchmarks is WIP and tracked here: SOLR-16525.
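
For anyone who wants to re-run the back-of-the-envelope estimate from the comment with different guesses, here is a minimal sketch. The function names are made up for illustration. The inputs are the comment's own unconfirmed assumptions: 8 nodes, each restarted once, 2 per batch, batches of equal duration run back to back, and (for the core count only) 1 shard and 1 replica per collection.

```python
# Rough restatement of the restart estimate; every input is a guess
# from the comment, not a value confirmed by the benchmark itself.

def cores_per_node(collections, shards, replicas, nodes):
    """Average number of cores hosted by each node."""
    return collections * shards * replicas / nodes

def per_batch_seconds(total_seconds, nodes, parallel):
    """Time per restart batch, assuming equal-length sequential batches."""
    batches = nodes / parallel
    return total_seconds / batches

def slowdown(before, after):
    """Relative increase from before to after."""
    return (after - before) / before

NODES = 8      # assumed; the suite file may imply 7
PARALLEL = 2   # nodes restarted at a time

before = per_batch_seconds(325, NODES, PARALLEL)  # 81.25
after = per_batch_seconds(350, NODES, PARALLEL)   # 87.5

print(cores_per_node(1000, 1, 1, NODES))  # 125.0 with 1 shard, 1 replica
print(f"{slowdown(before, after):.1%}")   # 7.7%
```

The batch count cancels out of the ratio, so the relative slowdown is simply (350 - 325) / 325 whether the cluster has 7 or 8 nodes. Only the absolute per-restart times depend on that assumption.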