HoustonPutman commented on code in PR #561:
URL: https://github.com/apache/solr-operator/pull/561#discussion_r1196567570
##########
controllers/solr_cluster_ops_util.go:
##########
@@ -0,0 +1,218 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package controllers
+
+import (
+	"context"
+	"errors"
+	solrv1beta1 "github.com/apache/solr-operator/api/v1beta1"
+	"github.com/apache/solr-operator/controllers/util"
+	"github.com/go-logr/logr"
+	appsv1 "k8s.io/api/apps/v1"
+	corev1 "k8s.io/api/core/v1"
+	"k8s.io/utils/pointer"
+	"sigs.k8s.io/controller-runtime/pkg/client"
+	"strconv"
+	"time"
+)
+
+func determineScaleClusterOpLockIfNecessary(ctx context.Context, r *SolrCloudReconciler, instance *solrv1beta1.SolrCloud, statefulSet *appsv1.StatefulSet, podList []corev1.Pod, logger logr.Logger) (clusterOpLock string, clusterOpMetadata string, retryLaterDuration time.Duration, err error) {
+	desiredPods := int(*instance.Spec.Replicas)
+	configuredPods := int(*statefulSet.Spec.Replicas)
+	if desiredPods != configuredPods {
+		scaleTo := -1
+		// Start a scaling operation
+		if desiredPods < configuredPods {
+			// Scale down!
+			// The option is enabled by default, so treat "nil" like "true"
+			if instance.Spec.Autoscaling.VacatePodsOnScaleDown == nil || *instance.Spec.Autoscaling.VacatePodsOnScaleDown {
+				if desiredPods > 0 {
+					// We only support scaling down one pod at-a-time if not scaling down to 0 pods
+					scaleTo = configuredPods - 1
+				} else {
+					scaleTo = 0
+				}
+			} else {
+				// The cloud is not setup to use managed scale-down
+				err = scaleCloudUnmanaged(ctx, r, statefulSet, desiredPods, logger)
+			}
+		} else if desiredPods > configuredPods {
+			// Scale up!
+			// TODO: replicasScaleUp is not supported, so do not make a clusterOp out of it, just do the patch
+			err = scaleCloudUnmanaged(ctx, r, statefulSet, desiredPods, logger)
+		}
+		if scaleTo > -1 {
+			clusterOpLock = util.ScaleLock
+			clusterOpMetadata = strconv.Itoa(scaleTo)
+		}
+	}
+	return
+}
+
+func handleLockedClusterOpScale(ctx context.Context, r *SolrCloudReconciler, instance *solrv1beta1.SolrCloud, statefulSet *appsv1.StatefulSet, podList []corev1.Pod, logger logr.Logger) (retryLaterDuration time.Duration, err error) {
+	if scalingToNodes, hasAnn := statefulSet.Annotations[util.ClusterOpsMetadataAnnotation]; hasAnn {
+		if scalingToNodesInt, convErr := strconv.Atoi(scalingToNodes); convErr != nil {
+			logger.Error(convErr, "Could not convert statefulSet annotation to int for scale-down-to information", "annotation", util.ClusterOpsMetadataAnnotation, "value", scalingToNodes)
+			err = convErr
+		} else {
+			replicaManagementComplete := false
+			if scalingToNodesInt < int(*statefulSet.Spec.Replicas) {
+				// Manage scaling down the SolrCloud
+				replicaManagementComplete, err = handleManagedCloudScaleDown(ctx, r, instance, statefulSet, scalingToNodesInt, podList, logger)
+				// } else if scalingToNodesInt > int(*statefulSet.Spec.Replicas) {
+				// TODO: Utilize the scaled-up nodes in the future, however Solr does not currently have APIs for this.
+				// TODO: Think about the order of scale-up and restart when individual nodeService IPs are injected into the pods.
+				// TODO: Will likely want to do a scale-up of the service first, then do the rolling restart of the cluster, then utilize the node.
+			} else {
+				// This shouldn't happen. The ScalingToNodesAnnotation is removed when the statefulSet size changes, through a Patch.
+				// But if it does happen, we should just remove the annotation and move forward.
+				patchedStatefulSet := statefulSet.DeepCopy()
+				delete(patchedStatefulSet.Annotations, util.ClusterOpsLockAnnotation)
+				delete(patchedStatefulSet.Annotations, util.ClusterOpsMetadataAnnotation)
+				if err = r.Patch(ctx, patchedStatefulSet, client.StrategicMergeFrom(statefulSet)); err != nil {
+					logger.Error(err, "Error while patching StatefulSet to remove unneeded clusterLockOp annotation for scaling to the current amount of nodes")
+				} else {
+					statefulSet = patchedStatefulSet
+				}
+			}
+
+			// Scale down the statefulSet to represent the new number of utilizedPods, if it is lower than the current number of pods.
+			// Also remove the "scalingToNodes" annotation, as that acts as a lock on the cluster, so that other operations,
+			// such as scale-up, pod updates and further scale-down cannot happen at the same time.
+			if replicaManagementComplete {
+				patchedStatefulSet := statefulSet.DeepCopy()
+				patchedStatefulSet.Spec.Replicas = pointer.Int32(int32(scalingToNodesInt))
+				delete(patchedStatefulSet.Annotations, util.ClusterOpsLockAnnotation)
+				delete(patchedStatefulSet.Annotations, util.ClusterOpsMetadataAnnotation)
+				if err = r.Patch(ctx, patchedStatefulSet, client.StrategicMergeFrom(statefulSet)); err != nil {
+					logger.Error(err, "Error while patching StatefulSet to scale down SolrCloud", "newUtilizedNodes", scalingToNodesInt)
+				}
+
+				// TODO: Create event for the CRD.
+			} else {
+				// Retry after five seconds to check if the replica management commands have been completed
+				retryLaterDuration = time.Second * 5
+			}
+		}
+		// If everything succeeded, the statefulSet will have an annotation updated
+		// and the reconcile loop will be called again.
+
+		return
+	} else {
+		err = errors.New("no clusterOpMetadata annotation is present in the statefulSet")
+		logger.Error(err, "Cannot perform scaling operation when no scale-to-nodes is provided via the clusterOpMetadata")
+		return time.Second * 10, err
+	}
+}
+
+// handleManagedCloudScaleDown does the logic of a managed and "locked" cloud scale down operation.
+// This will likely take many reconcile loops to complete, as it is moving replicas away from the nodes that will be scaled down.
+func handleManagedCloudScaleDown(ctx context.Context, r *SolrCloudReconciler, instance *solrv1beta1.SolrCloud, statefulSet *appsv1.StatefulSet, scaleDownTo int, podList []corev1.Pod, logger logr.Logger) (replicaManagementComplete bool, err error) {
+	// Before doing anything to the pod, make sure that users cannot send requests to the pod anymore.
+	podStoppedReadinessConditions := map[corev1.PodConditionType]podReadinessConditionChange{
+		util.SolrIsNotStoppedReadinessCondition: {
+			reason:  ScaleDown,
+			message: "Pod is being deleted, traffic to the pod must be stopped",
+			status:  false,
+		},
+	}
+
+	if scaleDownTo == 0 {
+		// Delete all collections & data, the user wants no data left if scaling the solrcloud down to 0.
+		// This is a much different operation from deleting the SolrCloud/StatefulSet altogether.
+		replicaManagementComplete, err = evictAllPods(ctx, r, instance, podList, podStoppedReadinessConditions, logger)

Review Comment:
   Wonderful question. I should make this clearer here, rather than obfuscating it. Currently I have not actually implemented data deletion. evictAllPods is basically a NO-OP that I left around for future improvement. If you scale down to 0, the operator just leaves the data and deletes all of the pods. This is actually the same as deleting the StatefulSet (though slightly different with regard to PVC lifespan). The only way the data would still live is if you are using persistent storage with the `RetainOnDelete` option for PVCs.

   The idea was that maybe we should just explicitly delete all collections before scaling to 0, since users might want to scale back up, and at that point the collections in ZK will no longer exist. I didn't implement this and left it as a future option. But we can make this a blocker for the next release if you think it's worthwhile.

   I'm also just unsure whether collection deletion is what we should be doing. Scaling down nodes, to me, doesn't include deleting data, now that I'm thinking about it more.

   Either way, this code path should be made clearer about what it actually does, which is basically a NO-OP currently.
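   For concreteness, here is a rough sketch of what an explicit "delete all collections before scaling to 0" pass could look like, written as standalone Go against the Solr Collections API (LIST and DELETE actions). None of this is in the PR: the package, function names, and plumbing are illustrative only; a real implementation would go through the operator's existing Solr client utilities and would need to be idempotent across reconcile loops.

   ```go
   // Package scaledownsketch is an illustrative sketch, not part of the solr-operator codebase.
   package scaledownsketch

   import (
   	"context"
   	"encoding/json"
   	"fmt"
   	"net/http"
   	"net/url"
   )

   // listCollections calls the Collections API LIST action and returns the collection names.
   func listCollections(ctx context.Context, solrBaseUrl string) ([]string, error) {
   	req, err := http.NewRequestWithContext(ctx, http.MethodGet, solrBaseUrl+"/solr/admin/collections?action=LIST&wt=json", nil)
   	if err != nil {
   		return nil, err
   	}
   	resp, err := http.DefaultClient.Do(req)
   	if err != nil {
   		return nil, err
   	}
   	defer resp.Body.Close()
   	if resp.StatusCode != http.StatusOK {
   		return nil, fmt.Errorf("LIST returned status %d", resp.StatusCode)
   	}
   	var listResponse struct {
   		Collections []string `json:"collections"`
   	}
   	if err := json.NewDecoder(resp.Body).Decode(&listResponse); err != nil {
   		return nil, err
   	}
   	return listResponse.Collections, nil
   }

   // deleteAllCollections issues a Collections API DELETE for every collection in the cluster.
   // This would run before the pods are removed, while at least one Solr node is still up.
   func deleteAllCollections(ctx context.Context, solrBaseUrl string) error {
   	collections, err := listCollections(ctx, solrBaseUrl)
   	if err != nil {
   		return err
   	}
   	for _, collection := range collections {
   		deleteUrl := fmt.Sprintf("%s/solr/admin/collections?action=DELETE&name=%s", solrBaseUrl, url.QueryEscape(collection))
   		req, err := http.NewRequestWithContext(ctx, http.MethodGet, deleteUrl, nil)
   		if err != nil {
   			return err
   		}
   		resp, err := http.DefaultClient.Do(req)
   		if err != nil {
   			return err
   		}
   		resp.Body.Close()
   		if resp.StatusCode != http.StatusOK {
   			return fmt.Errorf("DELETE of collection %q returned status %d", collection, resp.StatusCode)
   		}
   	}
   	return nil
   }
   ```

   Whether to wire something like this in, or to keep scale-down strictly data-preserving, is exactly the open question above.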