[ https://issues.apache.org/jira/browse/SOLR-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815673#comment-17815673 ]
Andrzej Bialecki commented on SOLR-17150:
-----------------------------------------

Here's the proposed approach to implement two thresholds:
* an absolute max limit to terminate any query that exceeds this allocation
* a relative dynamic limit to terminate queries that exceed "typical" allocation

For the absolute limit: as with other implementations, {{memAllowed}} would set the absolute limit per query (float value in megabytes?). In order to accommodate initial queries this should be set to a relatively high value, which isn't optimal later for typical queries: this higher limit will eventually catch runaway queries, but not before they consume significant memory.

For the dynamic limit: a histogram would be added to the metrics to track the recent memory usage per query (using an exponentially decaying reservoir). The life-cycle of the histogram could be tied either to SolrCore or to SolrIndexSearcher (the latter seems more appropriate because of the warmup queries that would skew the longer-term stats in SolrCore's life-cycle). After collecting a sufficient number of data points (e.g. {{N = 100}}) the component could start enforcing a dynamic limit based on a formula that takes into account the "typical" recent queries. For example: {{dynamicThreshold = X * p99}}, where {{X = 2.0}} by default.

Open issues:
* does the dynamic threshold make sense? does the formula make sense?
* I think that both the static and dynamic limits should be optional, i.e. some combination of query params should allow the user to skip the enforcement of either / both.
* since the dynamic limit involves parameters (at least N and X above) that determine long-term tracking, it can no longer be expressed just as short-lived query params; it needs a configuration with a life-cycle of SolrCore or longer. Where should we put this configuration?
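To make the two-threshold idea concrete, here is a minimal Java sketch of the decision logic. It is not the Solr implementation: the class {{MemLimitSketch}}, its constructor parameters and its method names are all hypothetical, and a plain fixed-size sliding window stands in for the exponentially decaying reservoir mentioned above. It also assumes that a non-positive {{memAllowedMb}} or {{multiplier}} is how a caller would switch off the corresponding limit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of the absolute + dynamic memory limit; not Solr code. */
final class MemLimitSketch {
    private final double memAllowedMb; // absolute limit in MB; <= 0 disables it (assumption)
    private final int minSamples;      // "N": data points required before the dynamic limit applies
    private final double multiplier;   // "X": factor applied to p99; <= 0 disables it (assumption)
    private final int windowSize;      // sliding window, a stand-in for a decaying reservoir
    private final Deque<Double> recent = new ArrayDeque<>();

    MemLimitSketch(double memAllowedMb, int minSamples, double multiplier, int windowSize) {
        this.memAllowedMb = memAllowedMb;
        this.minSamples = minSamples;
        this.multiplier = multiplier;
        this.windowSize = windowSize;
    }

    /** Record the memory (in MB) that a completed query allocated. */
    synchronized void record(double usedMb) {
        if (recent.size() == windowSize) {
            recent.removeFirst(); // drop the oldest sample
        }
        recent.addLast(usedMb);
    }

    /** Nearest-rank 99th percentile of the window; callers ensure it is non-empty. */
    private double p99() {
        double[] sorted = recent.stream().mapToDouble(Double::doubleValue).sorted().toArray();
        int rank = (99 * sorted.length + 99) / 100; // integer ceil(0.99 * n), always >= 1
        return sorted[rank - 1];
    }

    /** Returns true if a query that has allocated usedMb so far should be terminated. */
    synchronized boolean shouldTerminate(double usedMb) {
        // Absolute limit: applies to every query, including the first ones.
        if (memAllowedMb > 0 && usedMb > memAllowedMb) {
            return true;
        }
        // Dynamic limit: only enforced once enough samples were collected.
        if (multiplier > 0 && recent.size() >= Math.max(minSamples, 1)) {
            return usedMb > multiplier * p99();
        }
        return false;
    }
}
```

For example, with {{memAllowedMb = 1000}}, {{N = 100}} and {{X = 2.0}}, a window holding 100 samples of 50 MB each gives a dynamic threshold of 100 MB, while with only 99 samples just the 1000 MB absolute limit is enforced.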
> Create MemQueryLimit implementation
> -----------------------------------
>
> Key: SOLR-17150
> URL: https://issues.apache.org/jira/browse/SOLR-17150
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Query Limits
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> An implementation of {{QueryTimeout}} that terminates misbehaving queries
> that allocate too much memory for their execution.
> This is a bit more complicated than {{CpuQueryLimits}} because the first time
> a query is submitted it may legitimately allocate many sizeable objects
> (caches, field values, etc). So we want to catch and terminate queries that
> either exceed any reasonable threshold (eg. 2GB), or significantly exceed a
> time-weighted percentile of the recent queries.

--
This message was sent by Atlassian Jira (v8.20.10#820010)