[jira] [Commented] (SOLR-17138) Support other QueryTimeout criteria

ASF subversion and git services (Jira) Tue, 20 Feb 2024 05:28:31 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818811#comment-17818811
 ]


ASF subversion and git services commented on SOLR-17138:
--------------------------------------------------------

Commit b06495c2798b96604b19346eaa8b2b17caed0a9b in solr's branch 
refs/heads/branch_9x from Gus Heck
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b06495c2798 ]

SOLR-17140 - 9x backport (#2284)

* SOLR-17140 - provide extensible query limit concept. (#2237)
This replaces SolrQueryTimeoutImpl.java with a much simpler SolrTimeLimit and 
provides a QueryLimits class so that other types of limits can be added. 
(see also SOLR-17138)

* SOLR-17140 ensure we always have limits. SolrRequestInfo is created by 
HttpSolrCall, so any code path not flowing through that must make sure a 
SolrRequestInfo exists if it wants to use limit functionality.

* SOLR-17140 Limits should not be lost if we push a new request onto the stack. 
This makes it difficult to change limits for sub-requests, but I'm not 
convinced that such a thing makes sense anyway.

(cherry picked from commit 07f65499723d3190041849a7b3e2d70593b3eda5)

* SOLR-17140 Account for AnalyticsHandler in 9x 
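
As a rough illustration of the extensible-limit idea described in the commit 
message above, here is a minimal sketch in which several independent limits 
are consulted through one aggregate. The names Limit, WallClockLimit and 
LimitSet are hypothetical and are not the classes added by the commit; the 
actual QueryLimits and SolrTimeLimit APIs may well differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a single limit; not the Solr or Lucene interface.
interface Limit {
  boolean shouldExit();
}

// Trips once the given wall-clock budget has elapsed since construction.
final class WallClockLimit implements Limit {
  private final long deadlineNanos;

  WallClockLimit(long budgetMillis) {
    this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
  }

  @Override
  public boolean shouldExit() {
    // Comparing via subtraction stays correct even if nanoTime() overflows.
    return System.nanoTime() - deadlineNanos > 0;
  }
}

// Aggregates any number of limits; the query should stop as soon as one trips.
final class LimitSet implements Limit {
  private final List<Limit> limits = new ArrayList<>();

  LimitSet add(Limit limit) {
    limits.add(limit);
    return this;
  }

  @Override
  public boolean shouldExit() {
    for (Limit limit : limits) {
      if (limit.shouldExit()) {
        return true;
      }
    }
    return false;
  }
}
```

With this shape, a new kind of limit only has to implement shouldExit() and be 
added to the set; callers keep checking a single object.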

> Support other QueryTimeout criteria
> -----------------------------------
>
>                 Key: SOLR-17138
>                 URL: https://issues.apache.org/jira/browse/SOLR-17138
>             Project: Solr
>          Issue Type: New Feature
>          Components: Query Limits
>            Reporter: Andrzej Bialecki
>            Priority: Major
>
> Complex Solr queries can consume significant memory and CPU while being 
> processed. When OOM or CPU saturation is reached, Solr becomes unresponsive, 
> which further compounds the problem. Often such “killer queries” are not 
> written to logs, which makes them difficult to diagnose. This happens even 
> with best practices in place.
> It should be possible to set limits in Solr that cannot be exceeded by 
> individual queries. This mechanism would monitor an accumulating “cost” of a 
> query while it’s being executed and compare it to the configured maximum cost 
> (budget), expressed in terms of CPU and/or memory usage that can be 
> attributed to this query. Should these limits be exceeded, the individual 
> query execution should be terminated, without affecting other concurrently 
> executing queries.
> The CircuitBreakers functionality doesn't distinguish the source of the load 
> and can't protect other query executions from a particular runaway query. We 
> need a more fine-grained mechanism.
> The existing {{QueryTimeout}} API enables such termination of individual 
> queries. However, the existing implementation ({{SolrQueryTimeoutImpl}} used 
> with {{timeAllowed}} query param) only uses elapsed wall-clock time as the 
> termination criterion. This is insufficient: under resource contention, 
> wall-clock time does not correctly represent the actual CPU cost of 
> executing a particular query. A query may produce results after a long time 
> not because of its complexity or bad behavior but because of the general 
> resource contention caused by other concurrently executing queries. On the 
> other hand, a single runaway query may consume all resources and cause all 
> other valid queries to fail if they exceed the wall-clock {{timeAllowed}}.
> I propose adding two more criteria for limiting the maximum "query 
> budget":
>  * per-thread CPU time: using {{getThreadCpuTime}} to periodically check 
> ({{QueryTimeout.shouldExit()}}) the current CPU consumption since the start 
> of the query execution.
>  * per-thread memory allocation: using {{getThreadAllocatedBytes}}.
> I ran some JMH microbenchmarks to ensure that these two methods are available 
> on modern OS/JVM combinations and their cost is negligible (less than 0.5 
> us/call). This means that the initial implementation may call these methods 
> directly for every {{shouldExit()}} call without undue burden. If we decide 
> that this still adds too much overhead we can change this to periodic updates 
> in a background thread.
> These two "query budget" constraints can be implemented as implementations 
> of {{QueryTimeout}}. Initially we can use a configuration mechanism similar 
> to the one used for {{timeAllowed}}, i.e. pass the max value as a query 
> param, or add it to the search handler's invariants.
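
The two criteria proposed in the description could be sketched roughly as 
follows. This is only an illustration under stated assumptions: BudgetCheck 
is a hypothetical stand-in for QueryTimeout (only shouldExit() is modelled), 
CpuTimeBudget and AllocationBudget are made-up names, and the budgets are 
measured for the thread that constructs the object. It is not the 
implementation attached to this issue.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical stand-in for QueryTimeout; only shouldExit() is modelled.
interface BudgetCheck {
  boolean shouldExit();
}

// Trips when the creating thread has used more CPU time than the budget.
final class CpuTimeBudget implements BudgetCheck {
  private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();
  // getId() is deprecated on recent JDKs in favour of threadId(); kept for older JVMs.
  private final long threadId = Thread.currentThread().getId();
  private final long startCpuNanos;
  private final long budgetNanos;

  CpuTimeBudget(long budgetNanos) {
    this.budgetNanos = budgetNanos;
    this.startCpuNanos = THREADS.getThreadCpuTime(threadId);
  }

  @Override
  public boolean shouldExit() {
    long now = THREADS.getThreadCpuTime(threadId);
    // -1 means CPU time measurement is disabled or the thread is gone.
    if (startCpuNanos < 0 || now < 0) {
      return false;
    }
    return now - startCpuNanos > budgetNanos;
  }
}

// Trips when the creating thread has allocated more bytes than the budget.
final class AllocationBudget implements BudgetCheck {
  // getThreadAllocatedBytes lives on the com.sun.management extension of the
  // bean, which HotSpot-based JVMs provide; other JVMs may not.
  private static final com.sun.management.ThreadMXBean THREADS = lookup();
  private final long threadId = Thread.currentThread().getId();
  private final long startBytes;
  private final long budgetBytes;

  AllocationBudget(long budgetBytes) {
    this.budgetBytes = budgetBytes;
    this.startBytes = supported() ? THREADS.getThreadAllocatedBytes(threadId) : -1;
  }

  private static com.sun.management.ThreadMXBean lookup() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    return bean instanceof com.sun.management.ThreadMXBean
        ? (com.sun.management.ThreadMXBean) bean
        : null;
  }

  static boolean supported() {
    return THREADS != null
        && THREADS.isThreadAllocatedMemorySupported()
        && THREADS.isThreadAllocatedMemoryEnabled();
  }

  @Override
  public boolean shouldExit() {
    if (startBytes < 0) {
      return false; // measurement unavailable, so never trip
    }
    long now = THREADS.getThreadAllocatedBytes(threadId);
    return now >= 0 && now - startBytes > budgetBytes;
  }
}
```

Both checks read the counters directly on every shouldExit() call, which 
matches the initial approach suggested in the description; moving to periodic 
background sampling would only change where the counter is read.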



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
