Peter Bacsko created YUNIKORN-2766:
--------------------------------------
Summary: Only generate event if all predicates failed
Key: YUNIKORN-2766
URL: https://issues.apache.org/jira/browse/YUNIKORN-2766
Project: Apache YuniKorn
Issue Type: Improvement
Components: core - scheduler
Reporter: Peter Bacsko
Assignee: Peter Bacsko
Right now, we send an event to the pod as soon as a predicate fails for a single node:
{noformat}
if err := plugin.Predicates(&si.PredicatesArgs{
    AllocationKey: allocationKey,
    NodeID:        sn.NodeID,
    Allocate:      allocate,
}); err != nil {
    log.Log(log.SchedNode).Debug("running predicates failed",
        zap.String("allocationKey", allocationKey),
        zap.String("nodeID", sn.NodeID),
        zap.Bool("allocateFlag", allocate),
        zap.Error(err))
    // running predicates failed
    msg := err.Error()
    ask.LogAllocationFailure(msg, allocate)
    ask.SendPredicateFailedEvent(msg)
    return false
}
{noformat}
This is, however, not correct. We should only generate an event if *all*
predicates have failed, which means that the pod cannot be scheduled. A failing
predicate for a given node can be perfectly normal in many cases.
Instead, we should aggregate the failed predicates and send an event like:
{noformat}
All predicates failed for request '345d70d7-243a-4077-a9f8-0bb76c3532d7':
node(s) didn't match Pod's node affinity/selector (20x), node(s) had taints
that the pod didn't tolerate (5x)
{noformat}
where 20x and 5x indicate how many times each predicate failed.
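
The aggregation described above could be sketched roughly as follows. This is only an illustration, not the proposed patch: the failureAggregator type and its Add and Message methods are hypothetical names that do not exist in YuniKorn, and ordering the reasons by count is an assumption made here to keep the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// failureAggregator is a hypothetical helper (not part of YuniKorn) that
// counts how often each predicate failure message was seen for one request.
type failureAggregator struct {
	counts map[string]int
}

func newFailureAggregator() *failureAggregator {
	return &failureAggregator{counts: make(map[string]int)}
}

// Add records one predicate failure, e.g. err.Error() from plugin.Predicates.
func (a *failureAggregator) Add(msg string) {
	a.counts[msg]++
}

// Message builds the aggregated event text. Reasons are ordered by count
// (highest first), then alphabetically, so the output is deterministic.
func (a *failureAggregator) Message(allocationKey string) string {
	reasons := make([]string, 0, len(a.counts))
	for msg := range a.counts {
		reasons = append(reasons, msg)
	}
	sort.Slice(reasons, func(i, j int) bool {
		ci, cj := a.counts[reasons[i]], a.counts[reasons[j]]
		if ci != cj {
			return ci > cj
		}
		return reasons[i] < reasons[j]
	})
	parts := make([]string, 0, len(reasons))
	for _, msg := range reasons {
		parts = append(parts, fmt.Sprintf("%s (%dx)", msg, a.counts[msg]))
	}
	return fmt.Sprintf("All predicates failed for request '%s': %s",
		allocationKey, strings.Join(parts, ", "))
}

func main() {
	agg := newFailureAggregator()
	// Simulate 25 nodes that all reject the request.
	for i := 0; i < 20; i++ {
		agg.Add("node(s) didn't match Pod's node affinity/selector")
	}
	for i := 0; i < 5; i++ {
		agg.Add("node(s) had taints that the pod didn't tolerate")
	}
	// The caller would send this only after every candidate node failed.
	fmt.Println(agg.Message("345d70d7-243a-4077-a9f8-0bb76c3532d7"))
}
```

In the scheduler, Add would presumably replace the per-node SendPredicateFailedEvent call, and a single event carrying Message would be sent only if no node passed the predicates.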