rpuch commented on code in PR #4614: URL: https://github.com/apache/ignite-3/pull/4614#discussion_r1825728926
########## modules/cli/src/main/java/org/apache/ignite/internal/cli/decorators/ClusterStatusDecorator.java: ########## @@ -45,4 +46,17 @@ public TerminalOutput decorate(ClusterStatus data) { data.nodeCount(), fg(Color.RED).mark("not initialized") )); } + + private static String status(ClusterStatus status) { + switch (status) { + case MS_MAJORITY_LOST: + return fg(Color.RED).mark("Metastore majority lost"); + case HEALTHY: + return fg(Color.GREEN).mark("active"); + case CMG_MAJORITY_LOST: + return fg(Color.RED).mark("CMG majority lost"); Review Comment: The ordering is a bit strange: why is `HEALTHY` in the middle? ########## modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterState.java: ########## @@ -34,24 +36,31 @@ */ @Schema(description = "Information about current cluster state.") public class ClusterState { - @Schema(description = "List of cluster management group nodes. These nodes are responsible for maintaining RAFT cluster topology.") + @Schema(description = "List of cluster management group nodes. These nodes are responsible for maintaining RAFT cluster topology.", + requiredMode = RequiredMode.REQUIRED) @IgniteToStringInclude private final Collection<String> cmgNodes; - @Schema(description = "List of metastorage nodes. These nodes are responsible for storing RAFT cluster metadata.") + @Schema(description = "List of metastorage nodes. These nodes are responsible for storing RAFT cluster metadata.", + requiredMode = RequiredMode.REQUIRED) @IgniteToStringInclude private final Collection<String> msNodes; - @Schema(description = "Version of Apache Ignite that the cluster was created on.") + @Schema(description = "Version of Apache Ignite that the cluster was created on.", requiredMode = RequiredMode.REQUIRED) private final String igniteVersion; - @Schema(description = "Unique tag that identifies the cluster.") + @Schema(description = "Unique tag that identifies the cluster.", + requiredMode = RequiredMode.REQUIRED) private final ClusterTag clusterTag; @Schema(description = "IDs the cluster had before.") @IgniteToStringInclude private final @Nullable List<UUID> formerClusterIds; + @Schema(description = "Cluster status.", + requiredMode = RequiredMode.REQUIRED) + private final ClusterStatus clusterStatus; Review Comment: It seems that entities from different domains are mixed here. This `ClusterState` class represents the CMG cluster state that is the state stored in the CMG related to the cluster itself. But `ClusterStatus` is some volatile thing, it might change during runtime however it wants, even if the CMG cluster state does not change. I believe we should not mix them together. Could a distinct API endpoint (and CLI command) be added to access this volatile information? ########## modules/cli/src/main/java/org/apache/ignite/internal/cli/call/cluster/status/ClusterState.java: ########## @@ -38,15 +39,26 @@ public class ClusterStatus { private final List<String> metadataStorageNodes; - private ClusterStatus(int nodeCount, boolean initialized, String name, - boolean connected, String nodeUrl, List<String> cmgNodes, List<String> metadataStorageNodes) { + private final ClusterStatus clusterStatus; + + private ClusterState( + int nodeCount, + boolean initialized, + String name, + boolean connected, + String nodeUrl, + List<String> cmgNodes, + List<String> metadataStorageNodes, + ClusterStatus clusterStatus + ) { this.nodeCount = nodeCount; this.initialized = initialized; this.name = name; this.connected = connected; this.nodeUrl = nodeUrl; this.cmgNodes = cmgNodes; this.metadataStorageNodes = metadataStorageNodes; Review Comment: How about making defensive copies? Better safe than sorry, and this class should not have a huge amount of instances ########## modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterStatus.java: ########## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.ignite.internal.rest.api.cluster; + +/** + * Current health status of the cluster. + */ +public enum ClusterStatus { + /** + * The cluster is completely healthy. Minor losses in any of the groups are possible, + * but this does not affect the operation of the cluster. + */ + HEALTHY, + + /** + * The metastore group has lost its majority. Almost all of the cluster functions are inoperative. + * To restore their operation, it is necessary to return the majority to the metastore group. Review Comment: ```suggestion * To restore their operation, it is necessary to restore the majority to the metastore group. ``` ########## modules/rest/src/main/java/org/apache/ignite/internal/rest/cluster/ClusterManagementController.java: ########## @@ -84,16 +113,31 @@ public CompletableFuture<Void> init(@Body InitCommand initCommand) { }); } - private static ClusterState mapClusterState(org.apache.ignite.internal.cluster.management.ClusterState clusterState) { + private ClusterState mapClusterState(org.apache.ignite.internal.cluster.management.ClusterState clusterState) { return new ClusterState( clusterState.cmgNodes(), clusterState.metaStorageNodes(), clusterState.igniteVersion().toString(), new ClusterTag(clusterState.clusterTag().clusterName(), clusterState.clusterTag().clusterId()), - clusterState.formerClusterIds() + clusterState.formerClusterIds(), + mapClusterStatus(clusterState) ); } + private ClusterStatus mapClusterStatus(org.apache.ignite.internal.cluster.management.ClusterState clusterState) { + Set<String> metaStorageNodes = clusterState.metaStorageNodes(); + long presentMetaStorageNodes = topologyService.allMembers().stream() + .map(ClusterNode::name) + .filter(metaStorageNodes::contains) + .count(); + + if (presentMetaStorageNodes <= metaStorageNodes.size() / 2) { + return ClusterStatus.MS_MAJORITY_LOST; Review Comment: The fact that we have enough nodes in the physical topology does not necessarily imply we have an MG majority. For instance, something on levels higher than the network might prevent a leader to be elected. The most accurate way would be to try executing a read command against the MS group (like reading its current revision) and handle a timeout. ########## modules/cli/src/main/java/org/apache/ignite/internal/cli/call/cluster/status/ClusterState.java: ########## @@ -18,11 +18,12 @@ package org.apache.ignite.internal.cli.call.cluster.status; import java.util.List; +import org.apache.ignite.rest.client.model.ClusterStatus; /** * Class that represents the cluster status. */ -public class ClusterStatus { +public class ClusterState { Review Comment: We already have an entity called 'cluster state' in the CMG domain. This one is a different entity and yet it's called identically. Could a distinct name be devised? Something like ClusterRuntimeState, maybe. ########## modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterStatus.java: ########## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.ignite.internal.rest.api.cluster; + +/** + * Current health status of the cluster. + */ +public enum ClusterStatus { + /** + * The cluster is completely healthy. Minor losses in any of the groups are possible, + * but this does not affect the operation of the cluster. + */ + HEALTHY, + + /** + * The metastore group has lost its majority. Almost all of the cluster functions are inoperative. + * To restore their operation, it is necessary to return the majority to the metastore group. + */ + MS_MAJORITY_LOST, + + /** + * The cluster management group has lost its majority. The cluster is completely inoperative until the majority is returned. Review Comment: ```suggestion * The cluster management group has lost its majority. The cluster is completely inoperative until the majority is restored. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@ignite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org