Re: Out of memory when catching up

2018-03-29 Thread Lasse Nedergaard
Hi, for sure I can share more info. We run Flink 1.4.2 (but have the same problems on 1.3.2) on an AWS EMR cluster, with 6 TaskManagers on each m4.xlarge slave. TaskManager heap is set to 1850. We use RocksDBStateBackend. We have set akka.ask.timeout to 60 s in case GC should prevent heartbeats, yarn.maximum-f

Re: Out of memory when catching up

2018-03-26 Thread Timo Walther
Hi Lasse, to avoid OOM exceptions you should analyze your Flink job implementation. Are you creating a lot of objects within your Flink functions? Which state backend are you using? Could you tell us a little more about your pipeline? Usually, there should be enough memory fo

Out of memory when catching up

2018-03-21 Thread Lasse Nedergaard
Hi. When our jobs are catching up, they read at 10-20 times the normal rate, but then we lose our task managers to OOM. We could increase the memory allocation, but is there a way to figure out how high a rate we can consume with the current memory and slot allocation, and a way to limit t