Hi everyone,

Currently, Flink accepts jobs from multiple clients and executes them concurrently as long as the resource limits are not exceeded. However, multi-user support is very poor. We don't support queuing of jobs, and concurrent jobs have to share the available resources cooperatively. Otherwise, jobs will fail.
With YARN, we circumvent these problems because it provides proper user and session management. I'm now wondering: should we get rid of the pseudo multi-user mode and just support one user per Flink cluster instance?

Best,
Max

PS: This question came up while I was working on a pull request to support backtracking of intermediate results. I need to hold a copy of the full previous execution graph to resume from old results. With multiple users, we would have to build in some kind of session management to archive old execution graphs. Otherwise, they will consume too much memory in the job manager.