[ 
https://issues.apache.org/jira/browse/SOLR-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-15767:
----------------------------------
    Description: 
As promised in my ApacheCon talk (recording here: 
https://www.youtube.com/watch?v=mR_3bbqZSRg), I'd like to ship a core set of 
alerting rules for monitoring SolrCloud clusters specifically on Kubernetes.

The basic guidance will be for users to import these rules into the Prometheus 
stack (which provides a {{PrometheusRule}} CRD). Users will need to adjust the 
thresholds (e.g. p95) and the alert interval for their specific use cases. 
Eventually, I'd like to have a starter runbook to accompany the alert rules, 
but that is not ready yet.
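
As a rough illustration of what "adjusting the thresholds and alert interval" 
could look like, here is a minimal Python sketch that builds a 
{{PrometheusRule}} manifest with one example alert. It is not the rule set 
proposed in this issue. The metric name solr_metrics_core_query_p95_ms, the 
alert name, the resource name, the namespace, and the release label are all 
assumptions made for illustration, and "alert interval" is interpreted here as 
the rule's "for" duration.

```python
import json


def solr_prometheus_rule(p95_threshold_ms=500, for_duration="5m", namespace="monitoring"):
    """Build a PrometheusRule manifest (as a dict) holding one example Solr alert."""
    # Assumed metric name: verify it against what your Solr Prometheus
    # exporter actually exposes before relying on this expression.
    expr = f"max by (collection) (solr_metrics_core_query_p95_ms) > {p95_threshold_ms}"
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {
            "name": "solr-alert-rules",  # hypothetical resource name
            "namespace": namespace,
            # Hypothetical label: the Prometheus Operator only loads rules
            # whose labels match the ruleSelector of your Prometheus resource.
            "labels": {"release": "prometheus"},
        },
        "spec": {
            "groups": [
                {
                    "name": "solr.rules",
                    "rules": [
                        {
                            "alert": "SolrQueryP95LatencyHigh",  # hypothetical alert name
                            "expr": expr,
                            # How long the condition must hold before the alert fires.
                            "for": for_duration,
                            "labels": {"severity": "warning"},
                            "annotations": {
                                "summary": "Solr p95 query latency above threshold",
                            },
                        }
                    ],
                }
            ]
        },
    }


if __name__ == "__main__":
    # Emit JSON, which kubectl accepts in place of YAML.
    print(json.dumps(solr_prometheus_rule(p95_threshold_ms=750, for_duration="10m"), indent=2))
```

Assuming the script is saved as solr_rules.py (a hypothetical file name), the 
output could be piped to "kubectl apply -f -". Users outside Kubernetes could 
reuse the entries under spec.groups in a plain Prometheus rule file.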

Also, it's important to realize that the Prometheus stack provides a core set 
of alerting rules for monitoring node and pod health, so the Solr-specific set 
should be considered complementary to those, and users will be expected to use 
both sets. See: 
https://github.com/prometheus-operator/kube-prometheus/blob/0821adabf6f9f7aebf9343ecb07707826ce693ee/manifests/kubernetes-prometheusRule.yaml

If users are not running in Kubernetes or not using the Prometheus stack, the 
rules should still be useful as a getting-started guide on what to monitor.

  was:
As promised in my ApacheCon talk (recording here: 
https://www.youtube.com/watch?v=mR_3bbqZSRg), I'd like to ship a core set of 
alerting rules for monitoring SolrCloud clusters specifically on Kubernetes.

The basic guidance will be for users to import these rules into the Prometheus 
stack (which provides a {{PrometheusRule}} CRD). Users will need to adjust the 
thresholds for their specific uses (e.g. p95) and alert interval. Eventually, 
I'd like to have a starter runbook to accompany the alert rules, but not there 
yet.

If users are not running in Kubernetes or not using the Prometheus stack, the 
rules should still be useful as a getting started guide on what to monitor.


> Include a core set of Prometheus alert rules for monitoring SolrCloud on K8s
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-15767
>                 URL: https://issues.apache.org/jira/browse/SOLR-15767
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - prometheus-exporter
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h


