Hey folks, you might remember me posting a while back about a problem we were having with upgrading: we were running Solr 8.3 with no major problems, but when we tried upgrading to the latest Solr we saw slower response times and increased failures, to the point where we couldn't even consider upgrading.
Initial suspicion fell on the JVM's GC because the heap was getting huge. We also worried about thread creation, disk performance, and a few other things. Along the way we spotted that our issue seemed to be connected to external files that were being read after every refresh, since emptying the files cured the issue. But we needed them for boosts!

Long story short, with some help from the good folks at OpenSourceConnections, we finally got things resolved! It turns out that our problem was caused by refreshes making followers switch to a new searcher, which blocked until it had imported the external file. So first we tried adding the config suggested by the docs to set up listeners:

    <listener event="newSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>
    <listener event="firstSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>

These initially seemed to break everything - Solr simply didn't want to restart with these lines in place. Eventually though, we discovered that it *did* restart if left long enough (over ten minutes!), and after finally starting up with this config in place, our performance issues largely went away!

We removed some soft commit config we had on our followers, since there was no real need for it there and it was part of the reason for the slowdown; we also changed to the StandardDirectoryFactory for best possible performance. This brought the restart down to a mere(!) three minutes, which was just about acceptable for our use case. All these changes together gave us graceful switching over from the old searcher to the new one without losing any requests.

We also added lifecycle hooks to our AWS instances to ensure that our followers would be put into service as soon as they were fully refreshed and up-to-date, but not before. This removed our worries that any slowness on startup would mean out-of-date (or failed) search results.
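
For anyone wanting to try the same thing, here is a rough sketch of where the follower-side changes described above would sit in solrconfig.xml. It is illustrative only, not the exact production config: the two listener lines are the ones quoted above, while the surrounding `<query>` placement, the `directoryFactory` element, and the `updateHandler` class are assumptions about how these settings are usually expressed.

```xml
<config>
  <!-- Other settings omitted. -->

  <!-- Assumption: StandardDirectoryFactory selected explicitly. -->
  <directoryFactory name="DirectoryFactory" class="solr.StandardDirectoryFactory"/>

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- No <autoSoftCommit> block here: followers only receive
         replicated index data, so soft commits were not needed. -->
  </updateHandler>

  <query>
    <!-- Reload external file fields when a searcher is opened, so the
         file is loaded before that searcher starts serving queries. -->
    <listener event="newSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>
    <listener event="firstSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>
  </query>
</config>
```

With the firstSearcher listener in place, expect startup to take as long as it takes to load the external file, which is where the multi-minute restarts mentioned above come from.
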
We're halfway through upgrading our Solr infrastructure to 8.11.1 now, and we're seeing great results - we actually get fewer errors than with 8.3 - so we're expecting to be fully updated by the end of the week, and with better performance to boot!

Thanks all for your advice in the long process of troubleshooting this issue, and hopefully this write-up will give anybody else suffering in the same way some ideas on how to solve it :)

Dom