Hi community ~

I think this topic should be quite interesting. The idea is to reduce the workload of the JobManager and make a Session Cluster [2] more stable while it is running jobs. I have written up a plan for splitting the JobManager in FLIP-257 [1]:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-257+Flink+JobManager+Process+Split <https://cwiki.apache.org/confluence/display/FLINK/FLIP-257+Flink+JobManager+JobMaster+Thread+Split+to+Process>

This proposal describes a scheme for splitting the current process, together with a new process implementation that stays compatible with the original process model: the JobMaster component inside the JobManager is split out, and a parameter controls whether the new process model is enabled.

In the split scheme, when the user enables it through configuration, the JobMaster runs as an independent service, which reduces the workload of the JobManager. A new Dispatcher communicates and interacts with one or more split-out JobMasters to manage jobs.

The main features it provides are:

- After the user submits a job, the JobMaster thread, which previously ran inside the JobManager, runs in a separate process instead.
- Users can deploy multiple instances of the component mentioned above, which run multiple JobMaster threads, thereby reducing the workload of the JobManager.

Some of the challenging use cases that these features address are:

- Staying compatible with the original job running mode (running the JobMaster thread on the JobManager).
- Implementing a new Dispatcher that forwards job-related client operations.

I would love to hear and address your thoughts and feedback, and, if possible, drive FLIP-257 forward!

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-257+Flink+JobManager+Process+Split <https://cwiki.apache.org/confluence/display/FLINK/FLIP-257+Flink+JobManager+JobMaster+Thread+Split+to+Process>
[2] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/overview/#session-mode

--
Have a nice day ~
ConradJam