Wilfred Spiegelenburg created YUNIKORN-2737:
-----------------------------------------------

             Summary: Cleanup handleFailApplicationEvent handling
                 Key: YUNIKORN-2737
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2737
             Project: Apache YuniKorn
          Issue Type: Improvement
          Components: shim - kubernetes
            Reporter: Wilfred Spiegelenburg


When we handle a failed application in the shim in 
{{handleFailApplicationEvent()}}, we call the placeholder cleanup.
There are three issues with this:
 * The cleanup needs the app lock after it takes the mgr lock. The app lock is 
already held when we process the event. The cleanup should be placed last so 
that the manager lock is not held for longer than needed.
 * Failing an application is triggered by the core, which should already do 
the cleanup, so this call might be redundant to start with.
 * The failure handling also marks unassigned pods as failed, which means there 
is an overlap between the failure handling and the placeholder cleanup that we 
should remove. Either ignore all placeholders in the failure handling or only 
clean up assigned placeholders.
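
The ordering and overlap described in the list above could be sketched 
roughly as follows. This is a minimal illustration, not the shim's actual code: 
the types {{placeholderManager}}, {{application}} and {{pod}}, their fields, 
and the {{cleanUp()}} method are hypothetical stand-ins. It assumes the caller 
already holds the app lock and picks the first of the two options (ignore all 
placeholders in the failure handling).

```go
package main

import "sync"

// placeholderManager is a hypothetical stand-in for the shim's placeholder
// manager. Only the locking behaviour matters for this sketch.
type placeholderManager struct {
	sync.Mutex
	cleaned []string // IDs of applications whose placeholders were cleaned up
}

// cleanUp takes the manager lock for the duration of the cleanup. In this
// sketch the app lock it would also need is assumed to be held by the caller.
func (m *placeholderManager) cleanUp(appID string) {
	m.Lock()
	defer m.Unlock()
	m.cleaned = append(m.cleaned, appID)
}

// pod is a hypothetical, simplified view of a pod tracked for an application.
type pod struct {
	name        string
	assigned    bool
	placeholder bool
	failed      bool
}

// application is a hypothetical, simplified application object.
type application struct {
	lock sync.Mutex
	id   string
	pods []*pod
}

// handleFailApplicationEvent assumes the caller already holds app.lock,
// matching the statement that the app lock is held while the event is
// processed.
func (app *application) handleFailApplicationEvent(mgr *placeholderManager) {
	for _, p := range app.pods {
		// Leave placeholders to the placeholder cleanup so that the two
		// code paths do not overlap.
		if p.placeholder {
			continue
		}
		// Mark unassigned, non-placeholder pods as failed.
		if !p.assigned {
			p.failed = true
		}
	}
	// Placeholder cleanup goes last, so the manager lock is only held for
	// the cleanup itself and not for the rest of the failure handling.
	mgr.cleanUp(app.id)
}
```

If the core already performs the cleanup when it fails the application, the 
final {{cleanUp()}} call in this sketch would simply be dropped.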



