[ 
https://issues.apache.org/jira/browse/FLINK-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405753#comment-16405753
 ] 

Elias Levy commented on FLINK-8886:
-----------------------------------

Thanks.  I am aware of YARN, but we do want to manage YARN just to run Flink.  
I we wanted to go down the route of using a generic resource manager, we'd use 
Mesos, but we don't have a need to do.

As I alluded to, we already run multiple Flink clusters in standalone mode to 
gain isolation, but this is usually wasteful, as the JM is usually lightly 
loaded and you need at least two of them for high-availability.  So it would be 
useful to share JMs while jobs are isolated to TMs, which is what I am 
proposing.

As for the complexity of the proposal, I'd argue that is is relatively 
lightweight in the restrictive mode.  Just permit the TM to register a set of 
tags placed in the config file, then have the scheduler only schedule jobs on 
TMs that have an exact match for the tags. 

> Job isolation via scheduling in shared cluster
> ----------------------------------------------
>
>                 Key: FLINK-8886
>                 URL: https://issues.apache.org/jira/browse/FLINK-8886
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination, Local Runtime, Scheduler
>    Affects Versions: 1.5.0
>            Reporter: Elias Levy
>            Priority: Major
>
> Flink's TaskManager executes tasks from different jobs within the same JMV as 
> threads.  We prefer to isolate different jobs on their on JVM.  Thus, we must 
> use different TMs for different jobs.  As currently the scheduler will 
> allocate task slots within a TM to tasks from different jobs, that means we 
> must stand up one cluster per job.  This is wasteful, as it requires at least 
> two JobManagers per cluster for high-availability, and the JMs have low 
> utilization.
> Additionally, different jobs may require different resources.  Some jobs are 
> compute heavy.  Some are IO heavy (lots of state in RocksDB).  At the moment 
> the scheduler threats all TMs are equivalent, except possibly in their number 
> of available task slots.  Thus, one is required to stand up multiple cluster 
> if there is a need for different types of TMs.
>  
> It would be useful if one could specify requirements on job, such that they 
> are only scheduled on a subset of TMs.  Properly configured, that would 
> permit isolation of jobs in a shared cluster and scheduling of jobs with 
> specific resource needs.
>  
> One possible implementation is to specify a set of tags on the TM config file 
> which the TMs used when registering with the JM, and another set of tags 
> configured within the job or supplied when submitting the job.  The scheduler 
> could then match the tags in the job with the tags in the TMs.  In a 
> restrictive mode the scheduler would assign a job task to a TM only if all 
> tags match.  In a relaxed mode the scheduler could assign a job task to a TM 
> if there is a partial match, while giving preference to a more accurate match.
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to