DennisBerger1984 commented on PR #1155: URL: https://github.com/apache/solr/pull/1155#issuecomment-1363701253
@stillalex Hi Alex, before we head into the Christmas holidays (we're back in January), we'd like to comment on your recent results.

We think `synchronized` is currently faster because some substantial work inside the synchronized block is missing compared to a real-world scenario. The reason is that the cloudDescriptor is null, so 3 `MDC.put()` calls in `MDCLoggingContext.setCoreDescriptor()` are skipped. We also noticed that the core doesn't get closed anymore, although it should probably be closed inside the benchmark.

First, we'd like to thank you again for bringing up the discussion about how to prove that certain code paths perform better or worse, and for increasing the understanding of locking in the Solr community. We think this was a great idea, and the question behind it is essentially "does this PR help at all to make things better?".

Nevertheless, we have difficulties drawing conclusions from the outcome of this benchmark. What it basically answers is whether a call to `getCoreFromAnyList()` without thread contention and side effects performs better or worse with `synchronized` vs. `ReentrantReadWriteLock`. The answer is that `synchronized` performs similarly or better depending on the number of CPU cores, and that it uses far fewer resources. We suspect that the JVM's adaptive spinning for the lightweight locks behind `synchronized` learns pretty quickly that the benchmark consists mostly of a synchronized block, that there is not much that can run in parallel anyway, and how to align the calls to `getCoreFromAnyList()` so that no heavyweight locking is needed at all.

The real problem is the behaviour of `synchronized` in an entire Solr instance when there is thread contention, side effects like a big ThreadContext, other method calls grabbing the lock, and possibly IO. Then you can't guarantee that fast behaviour any more, and it is very difficult to construct a micro benchmark that covers such cases. What `ReentrantReadWriteLock` can guarantee is that it will never block read access, even when you have contention and lots of side effects such as slow `MDC.put()` calls, a big ThreadContext, and so on. That is its most valuable advantage.

In the new year we can run a macro (e.g. HTTP) benchmark on our side again to verify a benefit, extract flight recorder dumps, and measure timings at each synchronized entrance and exit. Maybe we can find the root cause of the long waits for the lock. But again, the point is not that our use case will be much faster; it is that you can construct a Solr usage pattern that will severely suffer from synchronized blocks but will never suffer with a `ReentrantReadWriteLock`.

Sincerely, @mpetris (Marco) and @debe (Dennis)
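To make the point about read access concrete, here is a minimal, self-contained sketch. It is not the PR's JMH benchmark and not Solr's actual `SolrCores` code; all class and method names below are illustrative. It guards a core lookup once with `synchronized` and once with the read lock of a `ReentrantReadWriteLock`, with a bit of per-lookup map work standing in for the `MDC.put()` calls that are skipped when cloudDescriptor is null. The naive timing loop in `main` is, of course, exactly the kind of micro measurement we caution against above; the sketch only illustrates the locking semantics, namely that readers holding the read lock do not serialize behind each other the way callers of a synchronized method do.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Illustrative sketch only: contrasts a synchronized core lookup with a
 * ReentrantReadWriteLock-guarded one when the critical section also does
 * per-request side work. Not Solr's SolrCores implementation.
 */
public class CoreLookupSketch {

    // Stand-in for the per-request work done while the lock is held
    // (e.g. the MDC.put() calls in MDCLoggingContext.setCoreDescriptor()).
    private static void simulateMdcWork() {
        Map<String, String> ctx = new HashMap<>();
        ctx.put("core", "core1");
        ctx.put("collection", "col1");
        ctx.put("shard", "shard1");
    }

    static class SynchronizedCores {
        private final Map<String, Object> cores = new HashMap<>();
        SynchronizedCores() { cores.put("core1", new Object()); }

        synchronized Object getCoreFromAnyList(String name) {
            simulateMdcWork();      // every reader serializes behind this work
            return cores.get(name);
        }
    }

    static class ReadWriteLockedCores {
        private final Map<String, Object> cores = new HashMap<>();
        private final ReadWriteLock lock = new ReentrantReadWriteLock();
        ReadWriteLockedCores() { cores.put("core1", new Object()); }

        Object getCoreFromAnyList(String name) {
            lock.readLock().lock(); // many readers may hold the read lock at once
            try {
                simulateMdcWork();  // concurrent readers are not blocked by it
                return cores.get(name);
            } finally {
                lock.readLock().unlock();
            }
        }
    }

    // Crude wall-clock harness; only meant to make the semantics observable.
    private static long time(Runnable lookup, int threads, int iterations) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> { for (int i = 0; i < iterations; i++) lookup.run(); });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedCores sync = new SynchronizedCores();
        ReadWriteLockedCores rw = new ReadWriteLockedCores();
        int threads = 16, iterations = 200_000;
        System.out.println("synchronized: " + time(() -> sync.getCoreFromAnyList("core1"), threads, iterations) + " ms");
        System.out.println("read lock:    " + time(() -> rw.getCoreFromAnyList("core1"), threads, iterations) + " ms");
    }
}
```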