One option is to use a separate cluster (JobManager + TaskManagers) for each job. This is fairly straightforward with the YARN support: "flink run" can launch a cluster for a job and tear it down afterwards.
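For example, with a Flink 1.x YARN setup something like the following should bring up a dedicated cluster for a single job and release the containers once it finishes (exact flags may vary with your Flink version, and the jar path is just a placeholder):

  # launch a per-job YARN cluster with 4 TaskManagers (1 GB JobManager, 4 GB TaskManagers),
  # run the job, and tear the cluster down when the job completes
  ./bin/flink run -m yarn-cluster -yn 4 -yjm 1024 -ytm 4096 ./path/to/your-job.jar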
Of course this means you must deploy YARN. That doesn't necessarily imply HDFS, though a Hadoop-compatible filesystem (HCFS) is needed for the YARN staging directory. This approach also facilitates richer scheduling and multi-user scenarios. One downside is the loss of a unified web UI for viewing all jobs.

> On May 11, 2016, at 8:32 AM, Jark Wu <wuchong...@alibaba-inc.com> wrote:
>
> As I understand it, Flink uses a thread model, meaning one TaskManager process may run many operator threads from different jobs. So tasks from different jobs compete for memory and CPU within the same process. In the worst case, a bad job eats most of the CPU and memory, which may lead to an OOM, and then the regular jobs die too. There is another problem: tasks from different jobs print their logs into the same file (the TaskManager log file), which makes debugging harder.
>
> As I understand it, Storm spawns workers for every job, and the tasks in one worker belong to the same job. So I'm confused about the purpose or advantages of Flink's design. One more question: are there any tips to solve the issues above, or any suggestions for implementing a design similar to Storm's?
>
> Thank you for any answers in advance!
>
> Regards,
> Jark Wu