Hi,

Have a look at https://issues.apache.org/jira/browse/SOLR-15767 and the associated PR, which proposes some common alerting rules.
Jan

> On 4 Feb 2025, at 17:08, Afroz, Neshat (NIH/NLM/NCBI) [C] <neshat.af...@nih.gov.INVALID> wrote:
>
> Hello
>
> I am setting up Solr monitoring using Prometheus, solr-exporter and Grafana.
> I also went ahead and installed Alertmanager. My collection CLUSTERSTATUS
> currently looks as follows:
>
> curl -u "solradmin:xxxxxxxxxxxxx" \
>   'http://host01:8940/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true'
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":1
>   },
>   "cluster":{
>     "collections":{
>       "test_coll1":{
>         "pullReplicas":0,
>         "configName":"test_coll1",
>         "replicationFactor":1,
>         "router":{
>           "name":"compositeId"
>         },
>         "nrtReplicas":1,
>         "tlogReplicas":0,
>         "shards":{
>           "shard1":{
>             "range":"80000000-7fffffff",
>             "state":"active",
>             "stateTimestamp":"1733764681612235673",
>             "replicas":{
>               "core_node2":{
>                 "core":"test_coll1_shard1_replica_n1",
>                 "node_name":"host01:8940_solr",
>                 "type":"NRT",
>                 "state":"active",
>                 "leader":"true",
>                 "force_set_state":"false",
>                 "base_url":"http://host01:8940/solr"
>               }
>             },
>             "health":"GREEN"
>           }
>         },
>         "health":"GREEN",
>         "znodeVersion":39,
>         "creationTimeMillis":1733764591523
>       },
>       "test_coll2":{
>         "pullReplicas":0,
>         "configName":"test_coll2",
>         "replicationFactor":1,
>         "router":{
>           "name":"compositeId"
>         },
>         "nrtReplicas":1,
>         "tlogReplicas":0,
>         "shards":{
>           "shard1":{
>             "range":"80000000-7fffffff",
>             "state":"active",
>             "stateTimestamp":"1733765402371289174",
>             "replicas":{
>               "core_node2":{
>                 "core":"test_coll2_shard1_replica_n1",
>                 "node_name":"host01:8940_solr",
>                 "type":"NRT",
>                 "state":"active",
>                 "leader":"true",
>                 "force_set_state":"false",
>                 "base_url":"http://host01:8940/solr"
>               }
>             },
>             "health":"GREEN"
>           }
>         },
>         "health":"GREEN",
>         "znodeVersion":39,
>         "creationTimeMillis":1733765317706
>       }
>     },
>     "live_nodes":["host01:8940_solr"]
>   }
> }
>
> I am looking to get alerted when the state of a core changes.
> As per
> https://solr.apache.org/guide/solr/latest/deployment-guide/cluster-node-management.html#clusterstatus
> I can have 4 states, i.e. red, orange, yellow and green. I am looking to
> set up an alert for these, either through Grafana or Alertmanager, so that I am
> emailed when any of these states occurs.
>
> I have the below entry in rules.yml in Prometheus:
>
>   # Alert for collection core status solr_collections_shard_state
>   - alert: SolrCoreDown
>     # Condition for alerting
>     expr: solr_collections_shard_state < 1.00
>     for: 1m
>     # Annotation - additional informational labels to store more information
>     annotations:
>       summary: "Solr shard {{ $labels.collection }}-{{ $labels.shard }} is down"
>       description: "The Solr shard {{ $labels.collection }}-{{ $labels.shard }} has been in a non-active state for 1 minute."
>
> I don't get alerted with the above rule. I assume that the same expression
> or query can be used in Grafana as well. I would appreciate guidance on
> writing the rule/query above, either in Alertmanager or Grafana, to get the
> desired alerts.
>
> Thanks
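
While debugging the alert rule, it can help to confirm independently of Prometheus what the CLUSTERSTATUS response actually reports. The following is a minimal sketch, not part of solr-exporter or any Solr tooling: the function name find_unhealthy and the SAMPLE dictionary are hypothetical, and SAMPLE is a trimmed-down copy of the response quoted above. It assumes that "active" is the only healthy shard/replica state and "GREEN" the only healthy health value.

```python
import copy


def find_unhealthy(cluster_status):
    """Return (collection, shard, replica, reason) tuples for every shard or
    replica that is not "active", and every shard whose health is not "GREEN".

    Hypothetical helper: it only walks the parsed CLUSTERSTATUS JSON.
    """
    problems = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            if shard.get("state") != "active":
                problems.append(
                    (coll_name, shard_name, None,
                     "shard state=%s" % shard.get("state")))
            # "health" may be absent on some versions; only flag it if present.
            health = shard.get("health")
            if health is not None and health != "GREEN":
                problems.append(
                    (coll_name, shard_name, None, "shard health=%s" % health))
            for replica_name, replica in shard.get("replicas", {}).items():
                if replica.get("state") != "active":
                    problems.append(
                        (coll_name, shard_name, replica_name,
                         "replica state=%s" % replica.get("state")))
    return problems


# Trimmed-down sample in the shape of the response quoted above (assumption:
# only the fields the helper reads are kept).
SAMPLE = {
    "cluster": {
        "collections": {
            "test_coll1": {
                "shards": {
                    "shard1": {
                        "state": "active",
                        "health": "GREEN",
                        "replicas": {
                            "core_node2": {"state": "active", "leader": "true"}
                        },
                    }
                }
            }
        }
    }
}

if __name__ == "__main__":
    # Healthy sample: nothing is reported.
    print(find_unhealthy(SAMPLE))

    # Simulate a degraded shard to see what would be reported.
    degraded = copy.deepcopy(SAMPLE)
    shard = degraded["cluster"]["collections"]["test_coll1"]["shards"]["shard1"]
    shard["health"] = "RED"
    shard["replicas"]["core_node2"]["state"] = "down"
    for problem in find_unhealthy(degraded):
        print(problem)
```

To run it against a live cluster, the output of the curl command quoted above could be parsed with json.load and passed to find_unhealthy. If this reports a problem while the Prometheus expression stays silent, the gap is more likely in the exporter metric or the rule than in Solr itself.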