The timeAllowed parameter is not a good choice for rate limiting and could crash the whole Solr cluster. In fact, timeAllowed is likely to increase the chances of crashing the whole cluster:
When the timeAllowed for a query expires, its client will get a failure, but the server handling the query will not kill the thread running that query. So Solr itself would still be working on that long-running query while the client has already received a timeout. These failure-receiving client threads are now free to process other requests: retry the failed ones or fire new queries to Solr. This should suffocate Solr even more, although the client application's threads will never get blocked.

With a rate limiter, we save both: the clients' extra traffic gets rejected responses, and all Solr nodes breathe easy too.

IMO, the timeAllowed parameter will almost always kill the whole Solr cluster.

-SG

On Fri, Aug 4, 2017 at 3:30 PM, Varun Thacker <[email protected]> wrote:

> Hi Hrishikesh,
>
> I think SOLR-7344 is probably an important addition to Solr. It could help
> users isolate analytical queries ( streaming ) , search queries and
> indexing requests and throttle requests
>
> Let's continue the discussion on the Jira
>
> On Thu, Aug 3, 2017 at 2:03 AM, Rick Leir <[email protected]> wrote:
>
> > On 2017-08-02 11:33 PM, Shawn Heisey wrote:
> >
> >> On 8/2/2017 8:41 PM, S G wrote:
> >>
> >>> Problem is that peak load estimates are just estimates.
> >>> It would be nice to enforce them from Solr side such that if a rate
> >>> higher than that is seen at any core, the core will automatically
> >>> begin to reject the requests.
> >>> Such a feature would contribute to cluster stability while making sure
> >>> the customer gets an exception to remind them of a slower rate.
> >>>
> >> Solr doesn't have anything like this. This is primarily because there
> >> is no network server code in Solr. The networking is provided by the
> >> servlet container. The container in modern Solr versions is nearly
> >> guaranteed to be Jetty. As long as I have been using Solr, it has
> >> shipped with a Jetty container.
> >>
> >> https://wiki.apache.org/solr/WhyNoWar
> >>
> >> I have no idea whether Jetty is capable of the kind of rate limiting
> >> you're after. If it is, it would be up to you to figure out the
> >> configuration.
> >>
> >> You could always put a proxy server like haproxy in front of Solr. I'm
> >> pretty sure that haproxy is capable rejecting connections when the
> >> request rate gets too high. Other proxy servers (nginx, apache, F5
> >> BigIP, solutions from Microsoft, Cisco, etc) are probably also capable
> >> of this.
> >>
> >> IMHO, intentionally causing connections to fail when a limit is exceeded
> >> would not be a very good idea. When the rate gets too high, the first
> >> thing that happens is all the requests slow down. The slowdown could be
> >> dramatic. As the rate continues to increase, some of the requests
> >> probably would begin to fail.
> >>
> >> What you're proposing would be guaranteed to cause requests to fail.
> >> Failing requests are even more likely than slow requests to result in
> >> users finding a new source for whatever service they are getting from
> >> your organization.
> >>
> > Shawn,
> > Agreed, a connection limit is not a good idea. But there is the
> > timeAllowed parameter
> > <https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThetimeAllowedParameter>
> >
> > timeAllowed - This parameter specifies the amount of time, in
> > milliseconds, allowed for a search to complete. If this time expires
> > before the search is complete, any partial results will be returned.
> >
> > https://stackoverflow.com/questions/19557476/timing-out-a-query-in-solr
> >
> > With timeAllowed, you need not estimate what connection rate is
> > unbearable. Rather, you would set a max response time. If some queries
> > take much longer than other queries, then this would cause the long
> > ones to fail, which might be a good strategy.
> > However, if queries normally all take about the same time, then this
> > would cause all queries to return partial results until the server
> > recovers, which might be a bad strategy. In this case, Walter's post
> > is sensible.
> >
> > A previous thread suggested that timeAllowed could cause bad performance
> > on some cloud servers.
> >
> > cheers -- Rick
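
P.S. To make the rate-limiter argument at the top of this message concrete, here is a minimal token-bucket sketch. It is only an illustration of the idea under stated assumptions: the names TokenBucket, try_acquire and handle are hypothetical, nothing here is Solr or Jetty API, and the 429 status is just one plausible way to signal a rejected request.

```python
import time


class TokenBucket:
    """Hypothetical per-core rate limiter (illustration only, not Solr code)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = float(rate_per_sec)   # tokens added per second
        self.capacity = float(burst)      # maximum tokens that can accumulate
        self.tokens = float(burst)        # start with a full bucket
        self.clock = clock                # injectable so tests can fake time
        self.last = clock()

    def try_acquire(self):
        """Take one token if available; never blocks the caller."""
        now = self.clock()
        refill = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle(bucket, run_query):
    """Reject before any query work starts, so no server thread is tied up."""
    if not bucket.try_acquire():
        return 429, None          # assumed "too many requests" style rejection
    return 200, run_query()
```

The point of the sketch is the ordering: the rejection happens before the query is started, so the server spends no thread on the excess request, whereas a timeout fires only after the work has already begun.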
