Peter Bacsko created YUNIKORN-3123:
--------------------------------------

             Summary: Add retry logic to AssumePod to prevent PV races
                 Key: YUNIKORN-3123
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3123
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: shim - kubernetes
            Reporter: Peter Bacsko
            Assignee: Peter Bacsko


Internally we ran into a strange problem which occurs on OpenShift. It seems to 
be related to how ephemeral volumes are handled by LSO (Local Storage Operator).
{noformat}
Events:
  Type     Reason          Age   From      Message
  ----     ------          ----  ----      -------
  Normal   Scheduling      22m   yunikorn  impala-1755495449-zgbl/impala-executor-000-0 is queued and waiting for allocation
  Warning  AssumePodError  22m   yunikorn  pod impala-executor-000-0 has conflicting volume claims: node(s) didn't find available persistent volumes to bind
  Normal   TaskFailed      22m   yunikorn  Task impala-1755495449-zgbl/impala-executor-000-0 is failed
{noformat}
The underlying issue is very likely a race condition between two separate 
volumeBinder instances. The one inside the {{VolumeBinding}} plugin already 
sees the volume when the predicates are evaluated, so the node is considered 
fit for the given pod. After the core completes the scheduling, 
{{context.AssumePod()}} is called, which results in yet another call to 
{{SchedulerVolumeBinder.FindPodVolumes()}}. However, this second instance 
hasn't yet received the update about the volumes being ready, so it returns 
an error. This also means that the bug is very sensitive to network latency.
 
The problem is difficult to reproduce. Our suggestion is to add simple retry 
logic around {{AssumePod()}}.
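A minimal sketch of what such a retry wrapper could look like, assuming a fixed backoff between attempts (the function name {{assumePodWithRetry}}, the attempt count, and the backoff value are illustrative, not the actual YuniKorn shim API; the real change would wrap the existing {{context.AssumePod()}} call):
{noformat}
package main

import (
	"fmt"
	"time"
)

// assumePodWithRetry calls fn up to maxAttempts times, sleeping for
// backoff between attempts, and returns the last error if every
// attempt fails. Retrying papers over the window in which the second
// volumeBinder instance hasn't yet observed the bound volumes.
func assumePodWithRetry(fn func() error, maxAttempts int, backoff time.Duration) error {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(backoff)
		}
	}
	return fmt.Errorf("AssumePod failed after %d attempts: %w", maxAttempts, err)
}

func main() {
	// Simulate the race: FindPodVolumes fails until the volume update
	// arrives (here: on the third call), then succeeds.
	calls := 0
	err := assumePodWithRetry(func() error {
		calls++
		if calls < 3 {
			return fmt.Errorf("conflicting volume claims: node(s) didn't find available persistent volumes to bind")
		}
		return nil
	}, 5, 10*time.Millisecond)
	fmt.Println("calls:", calls, "err:", err)
}
{noformat}
Since the failure window depends on informer/network latency, a small bounded retry with a short backoff should be enough; an unbounded retry would risk masking genuinely unsatisfiable volume claims.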
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
