Ruiwen Zhao created YUNIKORN-3128:
-------------------------------------

             Summary: Yunikorn ignores pending pods after apiserver errors
                 Key: YUNIKORN-3128
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3128
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: core - scheduler
    Affects Versions: 1.7.0
         Environment: EKS 1.31
            Reporter: Ruiwen Zhao


We are running load tests with YuniKorn in which pods are created at 200/s, 
and we monitor whether YuniKorn can schedule them at the same rate.

One issue we saw is that YuniKorn ends up ignoring a large number of pods 
(~2000) at the end of the load test and marks their application as completed. 
As shown below, many pods are still Pending, but YuniKorn has completed the 
application they belong to, so those pods are stuck. All the pods have 
"schedulerName: yunikorn".

{code:java}
❯ kc get pods -n spark8s-kube-burner-yunikorn | grep Pending | head
kube-burner-0-0-82077   0/1     Pending   0          16m
kube-burner-0-0-82105   0/1     Pending   0          16m
kube-burner-0-0-82129   0/1     Pending   0          16m
kube-burner-0-0-82132   0/1     Pending   0          16m
kube-burner-0-0-82140   0/1     Pending   0          16m
kube-burner-0-0-82141   0/1     Pending   0          16m

2025-09-29T18:28:18.866Z    INFO    core.scheduler.fsm    objects/application_state.go:147    Application state transition    {"appID": "yunikorn-spark8s-kube-burner-yunikorn-0", "source": "Completing", "destination": "Completed", "event": "completeApplication"}
{code}
When looking at one of the Pending pods (kube-burner-0-0-82077), we can see 
that YuniKorn was trying to schedule it but failed because of etcd errors. 
YuniKorn retried once, failed again, and then the task was submitted again, 
but there are no logs after that:
{code:java}
2025-09-30T21:18:45.248Z INFO shim.fsm cache/task_state.go:381 Task state transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task": "731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias": "spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "New", "destination": "Pending", "event": "InitTask"}

2025-09-30T21:18:45.260Z INFO shim.fsm cache/task_state.go:381 Task state transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task": "731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias": "spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "Pending", "destination": "Scheduling", "event": "SubmitTask"}

2025-09-30T21:18:59.464Z ERROR shim.client client/kubeclient.go:127 failed to bind pod {"namespace": "spark8s-kube-burner-yunikorn", "podName": "kube-burner-0-0-82077", "error": "Operation cannot be fulfilled on pods/binding \"kube-burner-0-0-82077\": etcdserver: request timed out"}

2025-09-30T21:18:59.465Z ERROR shim.cache.task cache/task.go:464 task failed {"appID": "yunikorn-spark8s-kube-burner-yunikorn-0", "taskID": "731dc815-9ee0-4767-a5a9-939219b94f6e", "reason": "bind pod to node failed, name: spark8s-kube-burner-yunikorn/kube-burner-0-0-82077, Operation cannot be fulfilled on pods/binding \"kube-burner-0-0-82077\": etcdserver: request timed out"}

2025-09-30T21:18:59.465Z INFO shim.fsm cache/task_state.go:381 Task state transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task": "731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias": "spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "Allocated", "destination": "Failed", "event": "TaskFail"}

2025-09-30T21:18:59.464Z ERROR shim.cache.task cache/task.go:388 bind pod to node failed {"taskID": "731dc815-9ee0-4767-a5a9-939219b94f6e", "error": "Operation cannot be fulfilled on pods/binding \"kube-burner-0-0-82077\": etcdserver: request timed out"}

2025-09-30T21:19:25.464Z INFO shim.fsm cache/task_state.go:381 Task state transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task": "731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias": "spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "New", "destination": "Pending", "event": "InitTask"}

2025-09-30T21:19:25.464Z INFO shim.fsm cache/task_state.go:381 Task state transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task": "731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias": "spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "Pending", "destination": "Scheduling", "event": "SubmitTask"}
{code}
The failure appears to be caused by an etcd timeout, which makes sense, but 
IMO the expected behavior is that YuniKorn keeps retrying to schedule the pods 
with backoff instead of completing the application and leaving them Pending.
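
For illustration, here is a minimal sketch of the kind of retry-with-backoff 
we would expect around the bind call, using client-go's retry helper. bindPod 
and its arguments are hypothetical stand-ins for the shim's bind logic in 
client/kubeclient.go, not YuniKorn's actual code:
{code:go}
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/util/retry"
)

// bindPod is a hypothetical stand-in for the shim's pods/binding call.
func bindPod(namespace, podName, nodeID string) error {
	// ... issue the pods/binding request via client-go ...
	return fmt.Errorf("etcdserver: request timed out") // the failure seen in the logs
}

func main() {
	// Exponential backoff: 500ms, 1s, 2s, 4s, 8s (plus jitter), then give up.
	backoff := wait.Backoff{
		Duration: 500 * time.Millisecond,
		Factor:   2.0,
		Jitter:   0.1,
		Steps:    5,
	}

	// retry.OnError re-invokes the bind as long as the error is retriable.
	err := retry.OnError(backoff, func(err error) bool {
		// Treat transient apiserver/etcd errors as retriable; a real
		// implementation would classify errors more carefully
		// (e.g. apierrors.IsServerTimeout).
		return true
	}, func() error {
		return bindPod("spark8s-kube-burner-yunikorn", "kube-burner-0-0-82077", "ip-10-0-0-1")
	})
	if err != nil {
		// Only after the backoff budget is exhausted should the task fail
		// for good, and even then the pod should not be silently dropped.
		fmt.Println("bind failed after retries:", err)
	}
}
{code}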

 

YuniKorn version: 1.7.0

Env: EKS 1.31


