Re: Container running beyond physical memory limits when processing DataStream

2016-08-04 Thread Stephan Ewen
Hi!

The JVM is allowed 1448m of memory, and it should never use more heap than that. The fact that the process is using more than 2GB of memory in total means that some libraries are allocating memory outside the heap. You can activate the memory logger to diagnose that: https://ci.apache.org/
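For Flink releases of that era, the memory logger Stephan mentions can be switched on in flink-conf.yaml (key names assume a 1.x-series Flink; check the docs of your version):

```
taskmanager.debug.memory.startLogThread: true
taskmanager.debug.memory.logIntervalMs: 5000
```

When enabled, the task manager periodically logs heap, non-heap, and garbage-collection statistics, which helps attribute the memory the process uses beyond the 1448m heap.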

Re: Container running beyond physical memory limits when processing DataStream

2016-08-03 Thread Jack Huang
Hi Max,

Changing yarn.heap-cutoff-ratio seems to suffice for the time being. Thanks for your help.

Regards,
Jack

On Tue, Aug 2, 2016 at 11:11 AM, Jack Huang wrote:
> Hi Max,
>
> Is there a way to limit the JVM memory usage (something like the -Xmx
> flag) for the task manager so that it w
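Jack's fix works because, when running on YARN, Flink reserves a cutoff share of the requested container memory for non-heap allocations and passes only the remainder to the JVM as -Xmx. A minimal sketch of that arithmetic (the 0.25 default ratio and the 384 MB minimum floor are assumptions about the Flink version in use, not confirmed by this thread):

```java
public class HeapCutoff {
    // Approximates how a Flink-on-YARN deployment might derive the JVM heap
    // from the container size: subtract max(ratio * container, minimum floor).
    static long heapSizeMb(long containerMb, double cutoffRatio, long cutoffMinMb) {
        long cutoff = Math.max((long) (containerMb * cutoffRatio), cutoffMinMb);
        return containerMb - cutoff;
    }

    public static void main(String[] args) {
        // Raising the ratio shrinks the heap, leaving more headroom for
        // off-heap memory before YARN's physical-memory check trips.
        System.out.println(heapSizeMb(2048, 0.25, 384)); // ratio dominates: 2048 - 512
        System.out.println(heapSizeMb(1024, 0.25, 384)); // floor dominates: 1024 - 384
    }
}
```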

Re: Container running beyond physical memory limits when processing DataStream

2016-08-02 Thread Jack Huang
Hi Max,

Is there a way to limit the JVM memory usage (something like the -Xmx flag) for the task manager so that it won't go over the YARN limit but will just run GC until there is memory to use? Trying to allocate "enough" memory for this stream task is not ideal because I could have indefinitely
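As Stephan's later reply explains, -Xmx caps only the Java heap; direct and native allocations live outside it, so a hard heap cap alone cannot keep the whole process under YARN's physical limit. A small JDK-only illustration (no Flink involved):

```java
import java.nio.ByteBuffer;

public class DirectMemoryDemo {
    public static void main(String[] args) {
        // This 64 MB lives outside the Java heap: -Xmx does not cap it,
        // but YARN's physical-memory check still counts it.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        System.out.println(offHeap.isDirect());   // true
        System.out.println(offHeap.capacity());   // 67108864

        // Runtime.maxMemory() reflects the -Xmx cap on the heap only.
        System.out.println(Runtime.getRuntime().maxMemory());
    }
}
```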

Re: Container running beyond physical memory limits when processing DataStream

2016-08-02 Thread Maximilian Michels
Your job creates a lot of String objects which need to be garbage collected. It could be that the garbage collector is not fast enough and YARN kills the JVM for consuming too much memory. You can try two things:

1) Give the task manager more memory
2) Increase the Yarn heap cutoff ratio (e.g. yarn.heap-cutoff-
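Both suggestions map to flink-conf.yaml entries. Key names are as used by Flink 1.x on YARN and the values are purely illustrative:

```
taskmanager.heap.mb: 2048       # 1) give the task manager more memory
yarn.heap-cutoff-ratio: 0.3     # 2) reserve a larger share of the container for non-heap use
```

A larger cutoff ratio trades heap for safety margin: the JVM gets a smaller -Xmx within the same container, leaving more room for off-heap allocations before YARN's limit is exceeded.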

Re: Container running beyond physical memory limits when processing DataStream

2016-07-29 Thread Maximilian Michels
Hi Jack,

Considering the type of job you're running, you shouldn't run out of memory. Could it be that the events are quite large strings? It could be that the TextOutputFormat doesn't write to disk fast enough and accumulates memory. Actually, it doesn't perform regular flushing, which could be an
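The flushing concern is easy to reproduce outside Flink: a buffered writer holds output in memory until its buffer fills or flush() is called explicitly. A minimal JDK-only sketch (this does not show TextOutputFormat's internals, only the general buffering behavior):

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;

public class FlushDemo {
    public static void main(String[] args) throws Exception {
        File out = File.createTempFile("events", ".txt");
        out.deleteOnExit();
        BufferedWriter writer = new BufferedWriter(new FileWriter(out));

        writer.write("event-1\n");
        // Nothing has reached disk yet: the record sits in the writer's buffer.
        System.out.println("before flush: " + out.length());

        writer.flush();
        // After an explicit flush, the bytes are on disk.
        System.out.println("after flush: " + out.length());
        writer.close();
    }
}
```

Without regular flushing, unwritten records accumulate in memory, which is one way a sink can contribute to the container exceeding its limit.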

Container running beyond physical memory limits when processing DataStream

2016-07-28 Thread Jack Huang
Hi all,

I am running a test Flink streaming task under YARN. It reads messages from a Kafka topic and writes them to the local file system.

object PricerEvent {
  def main(args: Array[String]) {
    val kafkaProp = new Properties()
    kafkaProp.setProperty("bootstrap.servers", "localhost:66