Ruiwen Zhao created YUNIKORN-3128:
-------------------------------------
Summary: Yunikorn ignores pending pods after apiserver errors
Key: YUNIKORN-3128
URL: https://issues.apache.org/jira/browse/YUNIKORN-3128
Project: Apache YuniKorn
Issue Type: Bug
Components: core - scheduler
Affects Versions: 1.7.0
Environment: EKS 1.31
Reporter: Ruiwen Zhao
We are running some load testing with Yunikorn, where pods are created at 200/s
and we monitor if Yunikorn can schedule them at the same rate.
One issue we saw is that Yunikorn ends up ignoring a large number of pods (~2000) at
the end of the load test and completes the application. As shown below, many
pods are still Pending, but Yunikorn completes the application they belong
to, so those pods are stuck. All the pods have "schedulerName:
yunikorn".
{code:java}
❯ kc get pods -n spark8s-kube-burner-yunikorn | grep Pending | head
kube-burner-0-0-82077 0/1 Pending 0 16m
kube-burner-0-0-82105 0/1 Pending 0 16m
kube-burner-0-0-82129 0/1 Pending 0 16m
kube-burner-0-0-82132 0/1 Pending 0 16m
kube-burner-0-0-82140 0/1 Pending 0 16m
kube-burner-0-0-82141 0/1 Pending 0 16m
2025-09-29T18:28:18.866Z INFO core.scheduler.fsm
objects/application_state.go:147 Application state transition {"appID":
"yunikorn-spark8s-kube-burner-yunikorn-0", "source": "Completing",
"destination": "Completed", "event": "completeApplication"} {code}
When looking at one of the Pending pods (kube-burner-0-0-82077), we can see that
Yunikorn was trying to schedule it, but failed to do so because of etcd
errors. Yunikorn retried once, failed again, and then submitted the
task again, but there are no logs after that:
{code:java}
2025-09-30T21:18:45.248Z INFO shim.fsm cache/task_state.go:381 Task state
transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task":
"731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias":
"spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "New",
"destination": "Pending", "event": "InitTask"}
2025-09-30T21:18:45.260Z INFO shim.fsm cache/task_state.go:381 Task state
transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task":
"731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias":
"spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "Pending",
"destination": "Scheduling", "event": "SubmitTask"}
2025-09-30T21:18:59.464Z ERROR shim.client client/kubeclient.go:127 failed to
bind pod {"namespace": "spark8s-kube-burner-yunikorn", "podName":
"kube-burner-0-0-82077", "error": "Operation cannot be fulfilled on
pods/binding \"kube-burner-0-0-82077\": etcdserver: request timed out"}
2025-09-30T21:18:59.465Z ERROR shim.cache.task cache/task.go:464 task failed
{"appID": "yunikorn-spark8s-kube-burner-yunikorn-0", "taskID":
"731dc815-9ee0-4767-a5a9-939219b94f6e", "reason": "bind pod to node failed,
name: spark8s-kube-burner-yunikorn/kube-burner-0-0-82077, Operation cannot be
fulfilled on pods/binding \"kube-burner-0-0-82077\": etcdserver: request timed
out"}
2025-09-30T21:18:59.465Z INFO shim.fsm cache/task_state.go:381 Task state
transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task":
"731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias":
"spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "Allocated",
"destination": "Failed", "event": "TaskFail"}
2025-09-30T21:18:59.464Z ERROR shim.cache.task cache/task.go:388 bind pod to
node failed {"taskID": "731dc815-9ee0-4767-a5a9-939219b94f6e", "error":
"Operation cannot be fulfilled on pods/binding \"kube-burner-0-0-82077\":
etcdserver: request timed out"}
2025-09-30T21:19:25.464Z INFO shim.fsm cache/task_state.go:381 Task state
transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task":
"731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias":
"spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "New",
"destination": "Pending", "event": "InitTask"}
2025-09-30T21:19:25.464Z INFO shim.fsm cache/task_state.go:381 Task state
transition {"app": "yunikorn-spark8s-kube-burner-yunikorn-0", "task":
"731dc815-9ee0-4767-a5a9-939219b94f6e", "taskAlias":
"spark8s-kube-burner-yunikorn/kube-burner-0-0-82077", "source": "Pending",
"destination": "Scheduling", "event": "SubmitTask"} {code}
The failure seems to be caused by an etcd timeout, which makes sense, but IMO the
expected behavior is that Yunikorn keeps trying to schedule the pods with
backoff.
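A minimal sketch of the retry-with-backoff behavior expected here on a transient apiserver/etcd error. This is illustrative only, not YuniKorn's actual API: bindWithBackoff, the attempt limit, and the delays are all assumptions, and the fake bind closure stands in for the real pods/binding call:
{code:java}
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient stands in for an apiserver error such as
// "etcdserver: request timed out", which is safe to retry.
var errTransient = errors.New("etcdserver: request timed out")

// bindWithBackoff retries bind up to maxAttempts times, doubling
// the delay after each failure (exponential backoff).
func bindWithBackoff(bind func() error, maxAttempts int, base time.Duration) error {
	delay := base
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = bind(); err == nil {
			return nil
		}
		fmt.Printf("attempt %d failed: %v; retrying in %v\n", attempt, err, delay)
		time.Sleep(delay)
		delay *= 2
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Fake bind that fails twice with a transient error, then succeeds.
	bind := func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	}
	if err := bindWithBackoff(bind, 5, 10*time.Millisecond); err != nil {
		fmt.Println("bind failed:", err)
		return
	}
	fmt.Println("pod bound after", calls, "attempts")
}
{code}
With something along these lines, a transient "etcdserver: request timed out" would not strand the pod; only after the attempt budget is exhausted would the task be failed for good.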
YuniKorn version: 1.7.0
Env: EKS 1.31
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]