[ https://issues.apache.org/jira/browse/YUNIKORN-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935015#comment-17935015 ]
Craig Condit edited comment on YUNIKORN-2864 at 3/12/25 11:06 PM:
------------------------------------------------------------------

[~kaichiaboy] I've updated the description of this JIRA to reflect the
approach we need to take with Kubernetes 1.32 and beyond, and added a
dependency on YUNIKORN-3053 (which updates the algorithm YuniKorn uses
internally). Once the PR for that merges, you should be able to start
implementing this, if you're able to take it on. If not, I'll take over the
issue and implement the e2e tests myself.


was (Author: ccondit):
[~kaichiaboy] I've updated the description of this JIRA to reflect the
approach we need to take with Kubernetes 1.32 and beyond, as well as added a
dependency on YUNIKORN-3053 (which updates the algorithm YuniKorn uses
internally). Once the PR for that merges, you should be able to start
implementing this.

> Add e2e tests for InPlacePodVerticalScaling feature
> ---------------------------------------------------
>
>                 Key: YUNIKORN-2864
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2864
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: shim - kubernetes
>            Reporter: Craig Condit
>            Assignee: Kaichia Chen
>            Priority: Major
>
> Build e2e tests to verify YuniKorn behavior when the
> InPlacePodVerticalScaling feature flag is enabled. This is possible in K8s
> 1.32 and later. Tests should be skipped on K8s 1.31 or earlier, or if the
> feature flag is not enabled.
>
> Proposal:
>
> We should create a new suite for the tests. As part of the suite
> initialization, we should create a small pod and attempt to resize it (see
> https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/
> for an example of how to do this). Basically, a PATCH request needs to be
> sent to the /resize subresource of the pod on the Kubernetes REST API.
>
> The initial pod should simply ask for something like 100m CPU, and attempt
> to resize down to 50m. If that fails (for any reason), skip the rest of the
> suite, as pod resizing is unavailable in the current cluster.
>
> If the pre-check succeeds, then continue with the suite. At a minimum we
> should validate:
>
> * Pod resize up:
> *# Schedule a pod with 100m CPU
> *# Validate that YuniKorn schedules the pod. Verify that the pod's resource
> utilization in the YuniKorn REST API is 100m CPU
> *# Patch the pod's /resize subresource with CPU: 200m
> *# Wait for pod.Status.ContainerStatuses[0].Resources.Requests to contain
> the updated value (with timeout)
> *# Wait for YuniKorn REST API to report 200m CPU used for the pod (with
> timeout)
> * Pod resize down:
> *# Schedule a pod with 200m CPU
> *# Validate that YuniKorn schedules the pod. Verify that the pod's resource
> utilization in the YuniKorn REST API is 200m CPU
> *# Patch the pod's /resize subresource with CPU: 100m
> *# Wait for pod.Status.ContainerStatuses[0].Resources.Requests to contain
> the updated value (with timeout)
> *# Wait for YuniKorn REST API to report 100m CPU used for the pod (with
> timeout)
> * Pod resize failure:
> *# Schedule a pod with 100m CPU
> *# Validate that YuniKorn schedules the pod. Verify that the pod's resource
> utilization in the YuniKorn REST API is 100m CPU
> *# Patch the pod's /resize subresource with CPU: 100000
> *# Wait for pod.Status.Resize to be equal to "Infeasible" (with timeout)
> *#* Note: This will probably need to change to look for a Pod condition
> after 1.33, as pod.Status.Resize will be deprecated
> *# Wait for YuniKorn REST API to report 100m CPU used for the pod (with
> timeout)
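
For illustration only, here is a minimal standard-library Go sketch of two building blocks the YUNIKORN-2864 proposal relies on: the PATCH body and path for a pod's /resize subresource, and a generic "wait with timeout" loop. The helper names (resizePath, cpuResizePatch, waitFor), the container name "sleep", and the pod name "resize-precheck" are hypothetical and not taken from the YuniKorn e2e framework. The patch shape follows the example in the Kubernetes resize documentation linked in the description. A real test would most likely send the patch through client-go and poll the pod status and the YuniKorn REST API instead of the stand-in condition used here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// errTimeout is returned by waitFor when the condition never became true.
var errTimeout = errors.New("timed out waiting for condition")

// resizePath returns the assumed REST path of a pod's resize subresource.
func resizePath(namespace, pod string) string {
	return fmt.Sprintf("/api/v1/namespaces/%s/pods/%s/resize", namespace, pod)
}

// cpuResizePatch builds a patch body that sets the CPU request of a single
// container, mirroring the shape shown in the Kubernetes resize docs.
func cpuResizePatch(container, cpu string) ([]byte, error) {
	patch := map[string]any{
		"spec": map[string]any{
			"containers": []map[string]any{{
				"name": container,
				"resources": map[string]any{
					"requests": map[string]string{"cpu": cpu},
				},
			}},
		},
	}
	return json.Marshal(patch)
}

// waitFor calls cond every interval until it returns true, returns an
// error, or the next attempt would fall past the timeout.
func waitFor(timeout, interval time.Duration, cond func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		ok, err := cond()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		if time.Now().Add(interval).After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Pre-check from the proposal: resize a 100m pod down to 50m.
	body, err := cpuResizePatch("sleep", "50m")
	if err != nil {
		panic(err)
	}
	fmt.Println("PATCH", resizePath("default", "resize-precheck"))
	fmt.Println(string(body))

	// Stand-in condition: a real test would read
	// pod.Status.ContainerStatuses[0].Resources.Requests here.
	attempts := 0
	err = waitFor(time.Second, 10*time.Millisecond, func() (bool, error) {
		attempts++
		return attempts >= 3, nil
	})
	fmt.Println("condition met:", err == nil, "after", attempts, "attempts")
}
```

The same waitFor helper could back each "(with timeout)" step in the list: a returned error in the pre-check would translate into skipping the suite, while in the individual tests it would be a failure.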