Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- The configuration is complicated, uncertain, and hard to understand.

Key changes to solve these problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted for across individual memory reservations and pools.
- Simplify the memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, so we would appreciate keeping the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing