[
https://issues.apache.org/jira/browse/YUNIKORN-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Bacsko resolved YUNIKORN-3123.
------------------------------------
Fix Version/s: 1.8.0
Resolution: Fixed
> Add retry logic to AssumePod to prevent PV races
> ------------------------------------------------
>
> Key: YUNIKORN-3123
> URL: https://issues.apache.org/jira/browse/YUNIKORN-3123
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: shim - kubernetes
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Internally we ran into a strange problem which occurs on OpenShift. It seems
> to be related to how ephemeral volumes are handled by LSO (Local Storage
> Operator).
> {noformat}
> Events:
>   Type     Reason          Age  From      Message
>   ----     ------          ---  ----      -------
>   Normal   Scheduling      22m  yunikorn  impala-1755495449-zgbl/impala-executor-000-0 is queued and waiting for allocation
>   Warning  AssumePodError  22m  yunikorn  pod impala-executor-000-0 has conflicting volume claims: node(s) didn't find available persistent volumes to bind
>   Normal   TaskFailed      22m  yunikorn  Task impala-1755495449-zgbl/impala-executor-000-0 is failed
> {noformat}
> The underlying issue is very likely a race condition between two separate
> volumeBinder instances. The one inside the {{VolumeBinding}} plugin already
> sees the volume when the predicates are evaluated, so the node is considered a
> fit for the given pod. After the core completes scheduling,
> {{context.AssumePod()}} is called, which triggers another call to
> {{SchedulerVolumeBinder.FindPodVolumes()}}. However, this second instance has
> not yet received the update about the volumes being ready, so it returns an
> error. This also makes the bug very sensitive to network latency.
>
> The issue is difficult to reproduce. Our suggestion is to add simple retry
> logic around {{AssumePod()}}, as sketched below.
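>
> A minimal sketch of such a bounded retry, assuming an
> {{AssumePod(podName, nodeID string) error}} signature and illustrative
> retry/backoff values; the names and numbers below are placeholders, not the
> committed implementation:
> {code:go}
> package cache // assumed package, for illustration only
>
> import "time"
>
> const (
>     assumePodRetries = 5                      // assumed retry budget
>     assumePodBackoff = 100 * time.Millisecond // assumed delay between attempts
> )
>
> // assumePodWithRetry retries the given assume function a bounded number of
> // times. The volumeBinder consulted by AssumePod may simply not have observed
> // the PV update yet, so a short wait and re-evaluation avoids failing the task
> // on a transient mismatch.
> func assumePodWithRetry(assume func(podName, nodeID string) error, podName, nodeID string) error {
>     var err error
>     for attempt := 0; attempt < assumePodRetries; attempt++ {
>         if err = assume(podName, nodeID); err == nil {
>             return nil
>         }
>         time.Sleep(assumePodBackoff)
>     }
>     // Give up after the retry budget; the last error stays visible to the caller.
>     return err
> }
> {code}
> At the existing call site this could be invoked as
> {{assumePodWithRetry(ctx.AssumePod, podName, nodeID)}}, provided {{AssumePod}}
> indeed has that signature. Keeping the attempt count bounded preserves the
> original error if the volume never becomes ready.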
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]