[ https://issues.apache.org/jira/browse/SOLR-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818811#comment-17818811 ]
ASF subversion and git services commented on SOLR-17138:
--------------------------------------------------------

Commit b06495c2798b96604b19346eaa8b2b17caed0a9b in solr's branch refs/heads/branch_9x from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b06495c2798 ]

SOLR-17140 - 9x backport (#2284)

* SOLR-17140 - provide extensible query limit concept. (#2237)
  This replaces SolrQueryTimeoutImpl.java with a much simpler SolrTimeLimit
  and provides a QueryLimits class to allow addition of additional types of
  limits. (see also SOLR-17138)
* SOLR-17140 ensure we always have limits.
  SolrRequestInfo is created by HttpSolrCall, so any code path not flowing
  through that needs to ensure SolrRequestInfo if it wants to use limit
  functionality.
* SOLR-17140 Limits should not be lost if we push a new request onto the stack.
  This makes it difficult to change limits for sub-requests, but I'm not
  convinced that such a thing makes sense anyway.
  (cherry picked from commit 07f65499723d3190041849a7b3e2d70593b3eda5)
* SOLR-17140 Account for AnalyticsHandler in 9x

> Support other QueryTimeout criteria
> -----------------------------------
>
>                 Key: SOLR-17138
>                 URL: https://issues.apache.org/jira/browse/SOLR-17138
>             Project: Solr
>          Issue Type: New Feature
>          Components: Query Limits
>            Reporter: Andrzej Bialecki
>            Priority: Major
>
> Complex Solr queries can consume significant memory and CPU while being
> processed. When OOM or CPU saturation is reached Solr becomes unresponsive,
> which further compounds the problem. Often such “killer queries” are not
> written to logs, which makes them difficult to diagnose. This happens even
> with best practices in place.
> It should be possible to set limits in Solr that cannot be exceeded by
> individual queries.
> This mechanism would monitor an accumulating “cost” of a
> query while it’s being executed and compare it to the configured maximum cost
> (budget), expressed in terms of CPU and/or memory usage that can be
> attributed to this query. Should these limits be exceeded the individual
> query execution should be terminated, without affecting other concurrently
> executing queries.
> The CircuitBreakers functionality doesn't distinguish the source of the load
> and can't protect other query executions from a particular runaway query. We
> need a more fine-grained mechanism.
> The existing {{QueryTimeout}} API enables such termination of individual
> queries. However, the existing implementation ({{SolrQueryTimeoutImpl}} used
> with {{timeAllowed}} query param) only uses elapsed wall-clock time as the
> termination criterion. This is insufficient - in case of resource contention
> the wall-clock time doesn’t represent correctly the actual CPU cost of
> executing a particular query. A query may produce results after a long time
> not because of its complexity or bad behavior but because of the general
> resource contention caused by other concurrently executing queries. OTOH a
> single runaway query may consume all resources and cause all other valid
> queries to fail if they exceed the wall-clock {{timeAllowed}}.
> I propose adding two additional criteria for limiting the maximum "query
> budget":
> * per-thread CPU time: using {{getThreadCpuTime}} to periodically check
> ({{QueryTimeout.shouldExit()}}) the current CPU consumption since the start
> of the query execution.
> * per-thread memory allocation: using {{getThreadAllocatedBytes}}.
> I ran some JMH microbenchmarks to ensure that these two methods are available
> on modern OS/JVM combinations and their cost is negligible (less than 0.5
> us/call). This means that the initial implementation may call these methods
> directly for every {{shouldExit()}} call without undue burden.
> If we decide
> that this still adds too much overhead we can change this to periodic updates
> in a background thread.
> These two "query budget" constraints can be implemented as subclasses of
> {{QueryTimeout}}. Initially we can use a similar configuration mechanism as
> with {{timeAllowed}}, i.e. pass the max value as a query param, or add it to
> the search handler's invariants.
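
For illustration only, the two proposed "query budget" criteria could be sketched roughly as below. This is not the code from the commit above. BudgetLimit, CpuTimeBudget, AllocationBudget and AnyLimit are hypothetical names, and BudgetLimit only mirrors the shouldExit() contract mentioned in the description so that the sketch needs nothing beyond the JDK. It also assumes a HotSpot-style JVM whose thread bean implements com.sun.management.ThreadMXBean, and that each limit is created on the thread that executes the query.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.List;

/** Hypothetical stand-in for the shouldExit() contract of QueryTimeout. */
interface BudgetLimit {
  boolean shouldExit();
}

/** Trips once the creating thread has used more CPU time than the budget. */
final class CpuTimeBudget implements BudgetLimit {
  private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();
  private final long threadId = Thread.currentThread().getId();
  private final long budgetNanos;
  private final long startNanos;

  CpuTimeBudget(long budgetMillis) {
    this.budgetNanos = budgetMillis * 1_000_000L;
    // getThreadCpuTime returns -1 when CPU time measurement is disabled.
    this.startNanos = THREADS.getThreadCpuTime(threadId);
  }

  @Override
  public boolean shouldExit() {
    long now = THREADS.getThreadCpuTime(threadId);
    if (startNanos < 0 || now < 0) {
      return false; // measurement unavailable: this sketch never trips
    }
    return now - startNanos > budgetNanos;
  }
}

/** Trips once the creating thread has allocated more bytes than the budget. */
final class AllocationBudget implements BudgetLimit {
  // Assumption: the platform bean implements the com.sun.management extension.
  private static final com.sun.management.ThreadMXBean THREADS =
      (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
  private final long threadId = Thread.currentThread().getId();
  private final long budgetBytes;
  private final long startBytes;

  AllocationBudget(long budgetBytes) {
    this.budgetBytes = budgetBytes;
    // getThreadAllocatedBytes returns -1 when allocation measurement is disabled.
    this.startBytes = THREADS.getThreadAllocatedBytes(threadId);
  }

  @Override
  public boolean shouldExit() {
    long now = THREADS.getThreadAllocatedBytes(threadId);
    if (startBytes < 0 || now < 0) {
      return false; // measurement unavailable: this sketch never trips
    }
    return now - startBytes > budgetBytes;
  }
}

/** Hypothetical composite: exit as soon as any one of the limits is exceeded. */
final class AnyLimit implements BudgetLimit {
  private final List<BudgetLimit> limits;

  AnyLimit(List<BudgetLimit> limits) {
    this.limits = List.copyOf(limits);
  }

  @Override
  public boolean shouldExit() {
    for (BudgetLimit limit : limits) {
      if (limit.shouldExit()) {
        return true;
      }
    }
    return false;
  }
}
```

Both classes query the JVM on every shouldExit() call, which matches the "call these methods directly" option in the description. Moving to periodic updates from a background thread would only change where the current values are read from.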