Piyush Narang created FLINK-14029:
-------------------------------------

             Summary: Update Flink's Mesos scheduling behavior to reject all expired offers
                 Key: FLINK-14029
                 URL: https://issues.apache.org/jira/browse/FLINK-14029
             Project: Flink
          Issue Type: Bug
            Reporter: Piyush Narang


While digging into why our Flink jobs weren't being scheduled on our internal 
Mesos setup, we noticed that we were hitting Mesos quota limits because of the 
way the Fenzo (https://github.com/Netflix/Fenzo/) library defaults are set up 
in the Flink project. 

The behavior we noticed was that we got a bunch of offers from our Mesos master 
(50+), of which 1 or 2 were heavily skewed and took up a huge chunk of our disk 
resource quota. Because of this we were not sent any new / different offers 
(our usage at the time plus the outstanding resource offers had reached our 
Mesos disk quota). As the Flink / Fenzo Mesos scheduling code was not using the 
1-2 disk-skewed offers, they ended up expiring. The Fenzo scheduler is set up 
in Flink with the default values for when to expire unused offers (120s) and 
for the maximum number of unused offer leases to reject at a time (4). 
Unfortunately, as we had a considerable number of outstanding expired offers 
(50+), we ended up rejecting only 4 or so every 2 minutes and never got around 
to rejecting the disk-skewed ones that were stopping us from scheduling our 
Flink job. As a result, our job sat waiting to be scheduled for more than an 
hour. 

An option to work around this is to reject all expired offers at the 2 minute 
expiry time rather than hold on to them. This will allow Mesos to send 
alternate offers that Fenzo might be able to schedule. 
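
To make the difference concrete, here is a rough back-of-the-envelope model in 
Python. It is only a sketch, not Fenzo's actual logic: it assumes expired 
offers are rejected in some fixed order, that one rejection round happens per 
120s expiry interval, and that no new offers arrive in the meantime. The 
function name and the worst-case position of the skewed offer (last of 50) are 
made up for illustration; only the 120s, 4 and 50 figures come from the 
description above.

```python
EXPIRY_SECS = 120  # offer expiry interval quoted in the report
MAX_REJECTS = 4    # per-round rejection cap quoted in the report


def secs_until_rejected(num_expired, skewed_index, max_rejects=None):
    """Seconds until the offer at skewed_index is rejected.

    skewed_index is the 0-based position of the skewed offer in the
    (assumed) rejection order. max_rejects=None models rejecting every
    expired offer in a single round.
    """
    if not 0 <= skewed_index < num_expired:
        raise ValueError("skewed_index must be within the expired offers")
    if max_rejects is None:
        rounds = 1  # everything goes back to Mesos in the first round
    else:
        # Each round clears max_rejects offers from the front of the order.
        rounds = skewed_index // max_rejects + 1
    return rounds * EXPIRY_SECS


# Worst case for the capped policy: the skewed offer is last of 50.
capped = secs_until_rejected(50, 49, MAX_REJECTS)
reject_all = secs_until_rejected(50, 49)
print(capped, reject_all)
```

Under these assumptions the capped policy needs 13 rounds (1560s, about 26 
minutes) to reach the last offer, while rejecting everything takes a single 
round (120s). The model leaves out resources that Mesos offers again after a 
rejection, so it does not account for the full delay we observed.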




--
This message was sent by Atlassian Jira
(v8.3.2#803003)
