rpuch commented on code in PR #4614:
URL: https://github.com/apache/ignite-3/pull/4614#discussion_r1825728926


##########
modules/cli/src/main/java/org/apache/ignite/internal/cli/decorators/ClusterStatusDecorator.java:
##########
@@ -45,4 +46,17 @@ public TerminalOutput decorate(ClusterStatus data) {
                         data.nodeCount(), fg(Color.RED).mark("not initialized")
                 ));
     }
+
+    private static String status(ClusterStatus status) {
+        switch (status) {
+            case MS_MAJORITY_LOST:
+                return fg(Color.RED).mark("Metastore majority lost");
+            case HEALTHY:
+                return fg(Color.GREEN).mark("active");
+            case CMG_MAJORITY_LOST:
+                return fg(Color.RED).mark("CMG majority lost");

Review Comment:
   The ordering is a bit strange: why is `HEALTHY` in the middle?



##########
modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterState.java:
##########
@@ -34,24 +36,31 @@
  */
 @Schema(description = "Information about current cluster state.")
 public class ClusterState {
-    @Schema(description = "List of cluster management group nodes. These nodes 
are responsible for maintaining RAFT cluster topology.")
+    @Schema(description = "List of cluster management group nodes. These nodes 
are responsible for maintaining RAFT cluster topology.",
+            requiredMode = RequiredMode.REQUIRED)
     @IgniteToStringInclude
     private final Collection<String> cmgNodes;
 
-    @Schema(description = "List of metastorage nodes. These nodes are 
responsible for storing RAFT cluster metadata.")
+    @Schema(description = "List of metastorage nodes. These nodes are 
responsible for storing RAFT cluster metadata.",
+            requiredMode = RequiredMode.REQUIRED)
     @IgniteToStringInclude
     private final Collection<String> msNodes;
 
-    @Schema(description = "Version of Apache Ignite that the cluster was 
created on.")
+    @Schema(description = "Version of Apache Ignite that the cluster was 
created on.", requiredMode = RequiredMode.REQUIRED)
     private final String igniteVersion;
 
-    @Schema(description = "Unique tag that identifies the cluster.")
+    @Schema(description = "Unique tag that identifies the cluster.",
+            requiredMode = RequiredMode.REQUIRED)
     private final ClusterTag clusterTag;
 
     @Schema(description = "IDs the cluster had before.")
     @IgniteToStringInclude
     private final @Nullable List<UUID> formerClusterIds;
 
+    @Schema(description = "Cluster status.",
+            requiredMode = RequiredMode.REQUIRED)
+    private final ClusterStatus clusterStatus;

Review Comment:
   It seems that entities from different domains are mixed here. This 
`ClusterState` class represents the CMG cluster state that is the state stored 
in the CMG related to the cluster itself. But `ClusterStatus` is some volatile 
thing, it might change during runtime however it wants, even if the CMG cluster 
state does not change.
   
   I believe we should not mix them together. Could a distinct API endpoint 
(and CLI command) be added to access this volatile information?



##########
modules/cli/src/main/java/org/apache/ignite/internal/cli/call/cluster/status/ClusterState.java:
##########
@@ -38,15 +39,26 @@ public class ClusterStatus {
 
     private final List<String> metadataStorageNodes;
 
-    private ClusterStatus(int nodeCount, boolean initialized, String name,
-            boolean connected, String nodeUrl, List<String> cmgNodes, 
List<String> metadataStorageNodes) {
+    private final ClusterStatus clusterStatus;
+
+    private ClusterState(
+            int nodeCount,
+            boolean initialized,
+            String name,
+            boolean connected,
+            String nodeUrl,
+            List<String> cmgNodes,
+            List<String> metadataStorageNodes,
+            ClusterStatus clusterStatus
+    ) {
         this.nodeCount = nodeCount;
         this.initialized = initialized;
         this.name = name;
         this.connected = connected;
         this.nodeUrl = nodeUrl;
         this.cmgNodes = cmgNodes;
         this.metadataStorageNodes = metadataStorageNodes;

Review Comment:
   How about making defensive copies? Better safe than sorry, and this class 
should not have a huge amount of instances



##########
modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterStatus.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.rest.api.cluster;
+
+/**
+ * Current health status of the cluster.
+ */
+public enum ClusterStatus {
+    /**
+     * The cluster is completely healthy. Minor losses in any of the groups 
are possible,
+     * but this does not affect the operation of the cluster.
+     */
+    HEALTHY,
+
+    /**
+     * The metastore group has lost its majority. Almost all of the cluster 
functions are inoperative.
+     * To restore their operation, it is necessary to return the majority to 
the metastore group.

Review Comment:
   ```suggestion
        * To restore their operation, it is necessary to restore the majority 
to the metastore group.
   ```



##########
modules/rest/src/main/java/org/apache/ignite/internal/rest/cluster/ClusterManagementController.java:
##########
@@ -84,16 +113,31 @@ public CompletableFuture<Void> init(@Body InitCommand 
initCommand) {
         });
     }
 
-    private static ClusterState 
mapClusterState(org.apache.ignite.internal.cluster.management.ClusterState 
clusterState) {
+    private ClusterState 
mapClusterState(org.apache.ignite.internal.cluster.management.ClusterState 
clusterState) {
         return new ClusterState(
                 clusterState.cmgNodes(),
                 clusterState.metaStorageNodes(),
                 clusterState.igniteVersion().toString(),
                 new ClusterTag(clusterState.clusterTag().clusterName(), 
clusterState.clusterTag().clusterId()),
-                clusterState.formerClusterIds()
+                clusterState.formerClusterIds(),
+                mapClusterStatus(clusterState)
         );
     }
 
+    private ClusterStatus 
mapClusterStatus(org.apache.ignite.internal.cluster.management.ClusterState 
clusterState) {
+        Set<String> metaStorageNodes = clusterState.metaStorageNodes();
+        long presentMetaStorageNodes = topologyService.allMembers().stream()
+                .map(ClusterNode::name)
+                .filter(metaStorageNodes::contains)
+                .count();
+
+        if (presentMetaStorageNodes <= metaStorageNodes.size() / 2) {
+            return ClusterStatus.MS_MAJORITY_LOST;

Review Comment:
   The fact that we have enough nodes in the physical topology does not 
necessarily imply we have an MG majority. For instance, something on levels 
higher than the network might prevent a leader to be elected. The most accurate 
way would be to try executing a read command against the MS group (like reading 
its current revision) and handle a timeout.



##########
modules/cli/src/main/java/org/apache/ignite/internal/cli/call/cluster/status/ClusterState.java:
##########
@@ -18,11 +18,12 @@
 package org.apache.ignite.internal.cli.call.cluster.status;
 
 import java.util.List;
+import org.apache.ignite.rest.client.model.ClusterStatus;
 
 /**
  * Class that represents the cluster status.
  */
-public class ClusterStatus {
+public class ClusterState {

Review Comment:
   We already have an entity called 'cluster state' in the CMG domain. This one 
is a different entity and yet it's called identically. Could a distinct name be 
devised? Something like ClusterRuntimeState, maybe.



##########
modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterStatus.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.rest.api.cluster;
+
+/**
+ * Current health status of the cluster.
+ */
+public enum ClusterStatus {
+    /**
+     * The cluster is completely healthy. Minor losses in any of the groups 
are possible,
+     * but this does not affect the operation of the cluster.
+     */
+    HEALTHY,
+
+    /**
+     * The metastore group has lost its majority. Almost all of the cluster 
functions are inoperative.
+     * To restore their operation, it is necessary to return the majority to 
the metastore group.
+     */
+    MS_MAJORITY_LOST,
+
+    /**
+     * The cluster management group has lost its majority. The cluster is 
completely inoperative until the majority is returned.

Review Comment:
   ```suggestion
        * The cluster management group has lost its majority. The cluster is 
completely inoperative until the majority is restored.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@ignite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to