[ https://issues.apache.org/jira/browse/SOLR-15300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317059#comment-17317059 ]
Jan Høydahl commented on SOLR-15300:
------------------------------------

{quote}Agreed. I would prefer to put it into each collection's props, perhaps using a less awkward name "liveState"? After all, we already report here other calculated data that doesn't come from state.json, such as aliases and roles.
{quote}
You are right, we add "aliases", "configName" and "znodeVersion". So adding a computed shard liveState would be possible. What should its possible values be? It is some kind of a composite/derived value... Perhaps three keys: {{"numReplicas": 5, "liveReplicas": 5, "shardLive": true}} ?

> Shard "state" flag is confusing and of limited value to outside consumers
> -------------------------------------------------------------------------
>
>                 Key: SOLR-15300
>                 URL: https://issues.apache.org/jira/browse/SOLR-15300
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>
> Solr API (and consequently the metric reporters, which are often used for
> Solr monitoring) report the shard as being in ACTIVE state even when in
> reality its functionality is severely compromised (e.g. no replicas, all
> replicas down, or no leader).
> This reported state is technically correct because it is used only for
> tracking of the SPLITSHARD operations, as defined in {{Slice.State}}.
> However, this may be misleading and more often unhelpful than not - for
> constant monitoring a flag that actually reports impaired functionality of a
> shard would be more useful than a flag that reports a relatively uncommon
> SPLITSHARD operation.
> We could either redefine the meaning of the existing flag (and change its
> state according to some of the criteria I listed above), or add another flag
> to represent the "health" status of a shard.
> The value of this flag would
> then provide an easy way to monitor and to alert external systems of
> dangerous function impairment, without monitoring the state of all replicas
> of a collection.
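
For illustration only, here is a rough sketch of how the three keys suggested in the comment ("numReplicas", "liveReplicas", "shardLive") might be derived on the consumer side. Everything in it is an assumption rather than Solr code: the function name {{shard_live_state}} is hypothetical, the input dictionaries are only modelled on the shape of a CLUSTERSTATUS-style response, and the rule for "shardLive" (a leader exists among replicas that are active and hosted on a live node) is just one reading of the criteria in the issue description (no replicas, all replicas down, or no leader).

```python
from typing import Any, Dict, Iterable


def shard_live_state(shard: Dict[str, Any], live_nodes: Iterable[str]) -> Dict[str, Any]:
    """Derive a composite liveness summary for one shard (hypothetical helper).

    Assumed input shape: shard["replicas"] maps replica names to dicts with
    "state", "node_name" and an optional "leader" flag stored as "true".
    """
    live = set(live_nodes)
    replicas = list(shard.get("replicas", {}).values())

    # A replica counts as live only if it is active AND its node is live.
    live_replicas = [
        r for r in replicas
        if r.get("state") == "active" and r.get("node_name") in live
    ]

    # Assumed rule: the shard is live when one of the live replicas is the leader.
    has_live_leader = any(
        str(r.get("leader", "false")).lower() == "true" for r in live_replicas
    )

    return {
        "numReplicas": len(replicas),
        "liveReplicas": len(live_replicas),
        "shardLive": has_live_leader,
    }


# Example: the shard's own "state" is "active", but only one replica is usable.
example_shard = {
    "state": "active",
    "replicas": {
        "core_node1": {"state": "active", "node_name": "n1", "leader": "true"},
        "core_node2": {"state": "down", "node_name": "n2"},
    },
}
summary = shard_live_state(example_shard, live_nodes=["n1"])
```

With this rule, a shard whose only leader sits on a node missing from the live nodes list would report "shardLive": false even though its {{Slice.State}} is still ACTIVE, which is the situation the issue description calls misleading.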