So, to summarize: from a large-cluster perspective, the resources are held
so that some expensive operations can be completed without repeating them
frequently, which would also block the callback thread.
That brings up an interesting point. Is there any out-of-the-box
instrumentation on scheduler performance that one can sim
Aurora holds offers for a few reasons:
- To avoid blocking the Mesos driver callback thread while matching offers
to pending tasks
- To enable preemption (determine whether cluster resources are exhausted,
and a low-priority task should be evicted for a high-priority one)
- To perform optimize sche
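
To make the hold-then-decline idea concrete, here is a minimal sketch. It is
not Aurora's actual code: OfferPool, HeldOffer, hold_seconds and the
first-fit matching are all hypothetical names and choices made for
illustration. The callback only records the offer, matching happens
separately, and offers held past the window are handed back so the caller
can decline them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HeldOffer:
    offer_id: str
    cpus: float
    received_at: float  # seconds, on whatever clock the caller uses


class OfferPool:
    """Hypothetical pool that holds offers briefly instead of declining at once."""

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds
        self.offers: Dict[str, HeldOffer] = {}

    def add(self, offer_id: str, cpus: float, now: float) -> None:
        # Cheap bookkeeping only, so a driver callback could return quickly.
        self.offers[offer_id] = HeldOffer(offer_id, cpus, now)

    def match(self, task_cpus: float) -> Optional[str]:
        # First fit (an assumption): take the first held offer big enough.
        for offer in self.offers.values():
            if offer.cpus >= task_cpus:
                del self.offers[offer.offer_id]
                return offer.offer_id
        return None

    def expire(self, now: float) -> List[str]:
        # Offers held past the window are returned so the caller can decline them.
        expired = [o.offer_id for o in self.offers.values()
                   if now - o.received_at >= self.hold_seconds]
        for offer_id in expired:
            del self.offers[offer_id]
        return expired
```

With a short hold window, an idle scheduler would hand offers back soon after
receiving them. With a long one, it keeps them even when nothing is pending.
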
We observed that Aurora's CPU share in the Mesos Master dashboard spikes up
to 100%, even when Aurora is not running any jobs. Looking at the code, I
found that the scheduler holds on to the resourceOffers even when there are
no tasks to be scheduled, and doesn't decline them immediately.
It looks like