As Fabian suggested, YARN is a good way to go for isolation (it actually isolates more than a JVM, which is very nice).
Here are some additional things you can do:

- For isolation between parallel tasks (within a job), start your YARN job such that each TaskManager has one slot, and start many TaskManagers. That is a bit less efficient (but not much) than fewer TaskManagers with more slots. (*)

- If you need to isolate successor tasks in a job from predecessor tasks, you can select "batch" execution mode. By default, the system uses "pipelined" execution mode; in a MapReduce-style job, this means that mappers and reducers run concurrently. With "batch" mode, reducers run only after all mappers have finished.

Greetings,
Stephan

(*) The reason why multiple slots in one TaskManager are more efficient is that the TaskManager multiplexes the data exchanges of a shuffle through a single TCP connection, reducing per-exchange overhead and usually increasing throughput.

On Thu, Jul 30, 2015 at 12:10 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> Hi,
>
> it is currently not possible to isolate tasks that consume a lot of JVM
> heap memory and schedule them to a specific slot (or TaskManager).
> If you operate in a YARN setup, you can isolate different jobs from each
> other by starting a new YARN session for each job, but tasks within the
> same job cannot be isolated from each other right now.
>
> Cheers, Fabian
>
> 2015-07-30 4:02 GMT+02:00 wangzhijiang999 <wangzhijiang...@aliyun.com>:
>
>> As I know, Flink uses a thread model in the TaskManager, which means one
>> TaskManager process may run many different operator threads, and these
>> threads compete for the memory of the process. I know that Flink has a
>> MemoryManager component in each TaskManager, and it controls the
>> LocalBufferPool of the InputGate and ResultPartition for each task, but if
>> a UDF consumes much memory, it uses JVM heap memory, so it cannot be
>> controlled by Flink. If I use Flink as a common platform, some users will
>> consume much memory in their UDFs, and that may influence other threads in
>> the process, especially through OOM.
>> I know that there are shared-slot and isolated-slot properties, but they
>> only constrain task scheduling within one TaskManager. Can I schedule a
>> task on a separate TaskManager if it consumes much memory and I do not
>> want it to influence other tasks? Or are there any suggestions for this
>> issue with the thread model? As far as I know, Spark also uses a thread
>> model, but Hadoop 2 uses a process model.
>>
>> Thank you for any suggestions in advance!
>>
>
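For reference, the one-slot-per-TaskManager setup suggested above could be started roughly like this. This is a sketch only: the flag names match the yarn-session.sh of the Flink versions from that era, and the container count and memory sizes are made-up example values, not recommendations.

```shell
# Sketch: start a Flink YARN session with many small TaskManagers,
# each offering a single task slot, so parallel tasks of a job run
# in separate JVMs (and separate YARN containers).
#   -n  : number of YARN containers (= TaskManagers), example value
#   -s  : task slots per TaskManager (1 => full isolation between slots)
#   -tm : memory per TaskManager in MB, example value
#   -jm : memory for the JobManager in MB, example value
./bin/yarn-session.sh -n 8 -s 1 -tm 2048 -jm 1024
```

With `-s 1`, each parallel task of a job occupies its own TaskManager JVM, trading some shuffle efficiency (see the footnote above) for memory isolation.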
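The "batch" execution mode mentioned above is selected on the ExecutionConfig of the DataSet API's ExecutionEnvironment. A minimal sketch, assuming the Flink DataSet API of this period (requires the Flink dependencies on the classpath; the job definition itself is elided):

```java
import org.apache.flink.api.common.ExecutionMode;
import org.apache.flink.api.java.ExecutionEnvironment;

public class BatchModeExample {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Default is ExecutionMode.PIPELINED: producers and consumers
        // (e.g. mappers and reducers) run concurrently and stream data.
        // BATCH makes successor tasks start only after all predecessor
        // tasks have finished, so they do not compete at the same time.
        env.getConfig().setExecutionMode(ExecutionMode.BATCH);

        // ... define the job on env as usual, then:
        // env.execute("my batch-mode job");
    }
}
```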