*Background*: We run Flink 1.4.0. We start the cluster with the /flink-jobmanager.sh foreground/ and /flink-taskmanager.sh foreground/ commands via Marathon, which launches them as Mesos tasks. So, basically, the JobManager and the TaskManagers run as Mesos tasks.
Now, say we run the Flink TaskManagers with taskmanager.heap.mb set to 7G in flink-conf.yaml, and the Marathon memory is set to 18G. Even so, we frequently see the TaskManager containers getting killed because of OOM.

The Flink streaming job that we run is a basic job without any windowing or other stateful operations. It's just a job that reads from a stream, applies a bunch of transformations, and writes the result back via BucketingSink. It uses RocksDB as the state backend.

So what I am trying to understand is: how does Flink allocate TaskManager memory in containers? What would be a safe value for us to set as Marathon memory so that our TaskManagers don't keep getting killed? Are we seeing this behaviour because we start the Flink TaskManagers in foreground mode as Mesos tasks?

Thanks
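
P.S. To put numbers on the question, here is a minimal Python sketch of the memory budget as I understand it. It is only bookkeeping and does not reproduce how Flink or the JVM actually sizes anything. It assumes that taskmanager.heap.mb maps directly to the JVM heap and that 7G and 18G mean 7168 MB and 18432 MB; the function name is just illustrative. Everything outside the heap (metaspace, thread stacks, direct buffers, and RocksDB's native allocations) would have to fit in the remainder.

```python
# Rough memory budget for one TaskManager container, using the numbers from this post.
# Assumption: taskmanager.heap.mb corresponds directly to the JVM heap size.

GIB_IN_MB = 1024

container_limit_mb = 18 * GIB_IN_MB  # Marathon memory setting (18G)
jvm_heap_mb = 7 * GIB_IN_MB          # taskmanager.heap.mb (7G)


def non_heap_headroom_mb(limit_mb: int, heap_mb: int) -> int:
    """Memory left in the container for everything outside the JVM heap."""
    return limit_mb - heap_mb


headroom_mb = non_heap_headroom_mb(container_limit_mb, jvm_heap_mb)
print(f"Non-heap headroom: {headroom_mb} MB ({headroom_mb / GIB_IN_MB:.1f} GiB)")
```

So roughly 11 GiB should be available for non-heap usage, and the containers are still being killed.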