[ https://issues.apache.org/jira/browse/SOLR-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17700055#comment-17700055 ]
Jan Høydahl commented on SOLR-15767:
------------------------------------

I was just looking into this myself. What remains to get this in? I can test these out in a test cluster of mine and give some feedback.

Also, it seems as if the prometheus operator installs a bunch of alerts by default. Should we in solr-operator have a simple way of installing a set of "safe", universal rules? Perhaps not the "subjective" QPS rules, but the set of rules that clearly indicate "yellow" or "red" cluster state.

> Include a core set of Prometheus alert rules for monitoring SolrCloud on K8s
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-15767
>                 URL: https://issues.apache.org/jira/browse/SOLR-15767
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - prometheus-exporter
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As promised in my ApacheCon talk (recording here:
> https://www.youtube.com/watch?v=mR_3bbqZSRg), I'd like to ship a core set of
> alerting rules for monitoring SolrCloud clusters specifically on Kubernetes.
> The basic guidance will be for users to import these rules into the
> Prometheus stack (which provides a {{PrometheusRule}} CRD). Users will need
> to adjust the thresholds for their specific uses (e.g. p95) and alert
> interval. Eventually, I'd like to have a starter runbook to accompany the
> alert rules, but not there yet.
> Also, it's important to realize that the Prometheus stack provides a core set
> of alerting rules for monitoring node and pod health, so the Solr specific
> set should be considered complementary to those and users will be expected to
> use both sets, see:
> https://github.com/prometheus-operator/kube-prometheus/blob/0821adabf6f9f7aebf9343ecb07707826ce693ee/manifests/kubernetes-prometheusRule.yaml
> If users are not running in Kubernetes or not using the Prometheus stack, the
> rules should still be useful as a getting started guide on what to monitor.