janhoy commented on code in PR #96: URL: https://github.com/apache/solr/pull/96#discussion_r1327778603
########## solr/solr-ref-guide/modules/deployment-guide/pages/circuit-breakers.adoc: ########## @@ -24,12 +24,19 @@ resource configuration. Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. If circuit breakers are enabled, requests may be rejected under the condition of high node duress with HTTP error code 429 'Too Many Requests'. -It is up to the client to handle this error and potentially build a retrial logic as this should ideally be a transient situation. +It is up to the client to handle this error and potentially build retry logic as this should be a transient situation. + +In a sharded collection, when a circuit breaker trips on one shard, the entire query will fail, +even if the other shard requests succeed. This will multiply the failures seen by the end users. +Setting the `shards.tolerant=true` parameter on requests can help with graceful degradation when +circuit breaker thresholds are reached on some nodes. See the <<shards.tolerant Parameter>> for details. == Circuit Breaker Configurations All circuit breaker configurations are listed as independent `<circuitBreaker>` entries in `solrconfig.xml` as shown below. A circuit breaker can register itself to trip for query requests and/or update requests. By default only search requests are affected. A user may register multiple circuit breakers of the same type with different thresholds for each request type. +Configuring circuit breakers for update requests in a collection with NRT replicas can result in +inconsistent contents in different replicas. Review Comment: Do we know this? Does the DistributedUpdateProcessor re-try a failed update? Shuold this warning be in a asciidoc warn/note block? Obviously we should proceed with only tripping Circuit breakers for external requests. I filed https://issues.apache.org/jira/browse/SOLR-16982 for this ########## solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java: ########## @@ -84,11 +79,45 @@ public String getErrorMessage() { + allowedCPUUsage.get(); } + public void setThreshold(double thresholdValueInPercentage) { + if (thresholdValueInPercentage > 100) { + throw new IllegalArgumentException("Invalid Invalid threshold value."); + } + + if (thresholdValueInPercentage <= 0) { + throw new IllegalStateException("Threshold cannot be less than or equal to zero"); + } + cpuUsageThreshold = thresholdValueInPercentage; + } + public double getCpuUsageThreshold() { return cpuUsageThreshold; } protected double calculateLiveCPUUsage() { - return operatingSystemMXBean.getSystemLoadAverage(); + // TODO: Use Codahale Meter to calculate the value + Metric metric = + this.core + .getCoreContainer() + .getMetricManager() + .registry("solr.jvm") + .getMetrics() + .get("os.systemCpuLoad"); + + if (metric == null) { + return -1.0; + } + + if (metric instanceof Gauge) { + @SuppressWarnings({"rawtypes"}) + Gauge gauge = (Gauge) metric; + // unwrap if needed + if (gauge instanceof SolrMetricManager.GaugeWrapper) { + gauge = ((SolrMetricManager.GaugeWrapper) gauge).getGauge(); + } + return (Double) gauge.getValue(); Review Comment: Return as percentage ```suggestion return (Double) gauge.getValue() * 100; ``` ########## solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java: ########## @@ -84,11 +79,45 @@ public String getErrorMessage() { + allowedCPUUsage.get(); } + public void setThreshold(double thresholdValueInPercentage) { + if (thresholdValueInPercentage > 100) { + throw new IllegalArgumentException("Invalid Invalid threshold value."); + } + + if (thresholdValueInPercentage <= 0) { + throw new IllegalStateException("Threshold cannot be less than or equal to zero"); + } + cpuUsageThreshold = thresholdValueInPercentage; + } + public double getCpuUsageThreshold() { return cpuUsageThreshold; } protected double calculateLiveCPUUsage() { Review Comment: Document method ```suggestion /** * Calculate the CPU usage for the system in percentage. * * @return Percent CPU usage of -1 if value could not be obtained. */ protected double calculateLiveCPUUsage() { ``` ########## solr/solr-ref-guide/modules/deployment-guide/pages/circuit-breakers.adoc: ########## @@ -48,19 +63,25 @@ To enable and configure the JVM heap usage based circuit breaker, add the follow The `threshold` is defined as a percentage of the max heap allocated to the JVM. -It does not logically make sense to have a threshold below 50% and above 95% of the max heap allocated to the JVM. +For the circuit breaker configuration, a value of "0" maps to 0% usage and a value of "100" maps to 100% usage. + +It does not logically make sense to have a threshold below 50% or above 95% of the max heap allocated to the JVM. Hence, the range of valid values for this parameter is [50, 95], both inclusive. Consider the following example: JVM has been allocated a maximum heap of 5GB (-Xmx) and `threshold` is set to `75`. In this scenario, the heap usage at which the circuit breaker will trip is 3.75GB. -=== CPU Utilization +=== System CPU Usage Circuit Breaker +This circuit breaker tracks system CPU usage and triggers if the recent CPU usage exceeds a configurable threshold. + +This is tracked with the JMX metric `OperatingSystemMXBean.getSystemCpuLoad()`. That measures the +recent CPU usage for the whole system. A value of 0.0 means that all CPUs were idle, while a value +of 1.0 means that all CPUs were actively running 100% of the time. Review Comment: We should not confuse users with mentioning `1.0` here, when the rest of the docs are about percentage. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org