wrunderwood commented on pull request #96: URL: https://github.com/apache/solr/pull/96#issuecomment-849021275
The existing CPU circuit breaker has an incorrect name, so yes, it is moved to a more accurate name. Load average can include things besides CPU. In some systems, it includes threads in iowait. So it should not have “CPU” in the name. The documentation explains what happens when the CPU circuit breaker is not available. Yes, the tests are similar because the two circuit breakers take similar inputs. wunder Walter Underwood ***@***.*** http://observer.wunderwood.org/ (my blog) > On May 26, 2021, at 3:43 AM, Atri Sharma ***@***.***> wrote: > > > @atris requested changes on this pull request. > > In general, I see some code duplication that should be abstracted. Please see the comments, thanks! > > In solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java <https://github.com/apache/solr/pull/96#discussion_r639600967>: > > > @@ -110,6 +113,27 @@ public double getCpuUsageThreshold() { > } > > protected double calculateLiveCPUUsage() { > - return operatingSystemMXBean.getSystemLoadAverage(); > + Metric metric = this.core > + .getCoreContainer() > + .getMetricManager() > + .registry("solr.jvm") > + .getMetrics() > + .get("os.systemCpuLoad"); > I dont quite understand this change -- so you are replacing the default CPU circuit breaker with a new implementation and moving the earlier implementation to a new class? > > IMO, both the circuit breakers operate on CPU, using different metrics. Both circuit breakers should be named in a way which represents the fact that they operate on CPU and highlight which metric they depend on. > > In solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java <https://github.com/apache/solr/pull/96#discussion_r639601381>: > > > + > + if (metric == null) { > + return -1.0; > + } > + > + if (metric instanceof Gauge) { > + @SuppressWarnings({"rawtypes"}) > + Gauge gauge = (Gauge) metric; > + // unwrap if needed > + if (gauge instanceof SolrMetricManager.GaugeWrapper) { > + gauge = ((SolrMetricManager.GaugeWrapper) gauge).getGauge(); > + } > + return ((Double) gauge.getValue()).doubleValue(); > + } > + > + return -1.0; // Unable to unpack metric > Please log a warning here -- we should not silently project that the system is not under load > > In solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java <https://github.com/apache/solr/pull/96#discussion_r639601638>: > > > > public CircuitBreakerConfig(final boolean enabled, final boolean memCBEnabled, final int memCBThreshold, > - final boolean cpuCBEnabled, final int cpuCBThreshold) { > + final boolean cpuCBEnabled, final int cpuCBThreshold, > Rename to highlight the metric (like you did for loadAverageCB renaming) > > In solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java <https://github.com/apache/solr/pull/96#discussion_r639602156>: > > > @@ -40,12 +41,15 @@ > * solution. There will be a follow up with a SIP for a schema API design. > */ > public class CircuitBreakerManager implements PluginInfoInitialized { > + private final SolrCore core; > I am wary of adding a SolrCore dependency here since only one circuit breaker really uses it and this pollutes the API a fair bit. > > In solr/core/src/java/org/apache/solr/util/circuitbreaker/LoadAverageCircuitBreaker.java <https://github.com/apache/solr/pull/96#discussion_r639603549>: > > > + > + @Override > + public boolean isTripped() { > + if (!isEnabled()) { > + return false; > + } > + > + if (!enabled) { > + return false; > + } > + > + double localAllowedLoadAverage = getLoadAverageThreshold(); > + double localSeenLoadAverage = calculateLiveLoadAverage(); > + > + if (localSeenLoadAverage < 0) { > + if (log.isWarnEnabled()) { > There is a fair bit of overlap between the two circuit breakers -- can we move the common code and methods to a super class, and define the contract which both (and future) CPU circuit breakers need to adhere to? > > In solr/solr-ref-guide/src/circuit-breakers.adoc <https://github.com/apache/solr/pull/96#discussion_r639604003>: > > > +recent load average for the whole system. A "load average" is the number of processes using or waiting for a CPU, > +usually averaged over one minute. Some systems include processes waiting on IO in the load average. Check the > +documentation for your system and JVM to understand this metric. For more information, see the > +https://en.wikipedia.org/wiki/Load_(computing)[Wikipedia page for Load], > + > +Configuration for load average circuit breaker: > + > +[source,xml] > +---- > +<str name="loadAverageEnabled">true</str> > +---- > + > +Note that this configuration will be overridden by the global circuit breaker flag -- if circuit breakers are disabled, this flag > +will not help you. > + > +The triggering threshold is a floating point number matching load average. This circuit breaker will trip when the > Not reviewing these changes -- will depend on Cassandra's review at some point > > In solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java <https://github.com/apache/solr/pull/96#discussion_r639604900>: > > > @@ -199,7 +201,63 @@ public void testFakeCPUCircuitBreaker() { > PluginInfo pluginInfo = h.getCore().getSolrConfig().getPluginInfo(CircuitBreakerManager.class.getName()); > > CircuitBreaker.CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerManager.buildCBConfig(pluginInfo); > - CircuitBreaker circuitBreaker = new FakeCPUCircuitBreaker(circuitBreakerConfig); > + CircuitBreaker circuitBreaker = new FakeCPUCircuitBreaker(circuitBreakerConfig, null); > Please add a comment representing what the null stands for > > In solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java <https://github.com/apache/solr/pull/96#discussion_r639605898>: > > > + } finally { > + if (!executor.isShutdown()) { > + executor.shutdown(); > + } > + } > + } > + > + public void testFakeLoadAverageCircuitBreaker() { > + AtomicInteger failureCount = new AtomicInteger(); > + > + ExecutorService executor = ExecutorUtil.newMDCAwareCachedThreadPool( > + new SolrNamedThreadFactory("TestCircuitBreaker")); > + try { > + removeAllExistingCircuitBreakers(); > + > + PluginInfo pluginInfo = h.getCore().getSolrConfig().getPluginInfo(CircuitBreakerManager.class.getName()); > Isnt this test doing the exact same thing as the original CPU circuit breaker test, only using a different class? > > In solr/core/src/test/org/apache/solr/util/TestCircuitBreaker.java <https://github.com/apache/solr/pull/96#discussion_r639606722>: > > > } > > @Override > protected double calculateLiveCPUUsage() { > return 92; // Return a value large enough to trigger the circuit breaker > } > } > + > + private static class FakeLoadAverageCircuitBreaker extends LoadAverageCircuitBreaker { > + public FakeLoadAverageCircuitBreaker(CircuitBreakerConfig config) { > Again, in this class not an exact copy of FakeCPUCircuitBreaker? > > — > You are receiving this because you authored the thread. > Reply to this email directly, view it on GitHub <https://github.com/apache/solr/pull/96#pullrequestreview-668855220>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACR4IINRJPCTF5S4PNJZEK3TPTGD3ANCNFSM43T5FIZA>. > -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org