Hi,

could you provide us with a heap dump from a TM that has been running for a 
while (ideally taken shortly before an OOME)? This would greatly help us figure 
out whether a classloader leak is causing the problem.
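
In case it helps, here is a minimal sketch of one way to write such a dump 
from inside a running JVM. It assumes a HotSpot JVM and uses the 
HotSpotDiagnosticMXBean from com.sun.management. The class name HeapDumper 
and the default file name taskmanager.hprof are made up for illustration, and 
nothing here is Flink API. Since it dumps the JVM it runs in, it would have to 
be called from code running inside the TM.

```java
import com.sun.management.HotSpotDiagnosticMXBean;

import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Flink): dumps the heap of the JVM it runs in.
final class HeapDumper {

    private HeapDumper() {
    }

    /**
     * Writes a heap dump of the current JVM to the given file.
     *
     * @param file target file; dumpHeap fails if it already exists, and newer
     *             JDKs additionally require the name to end in ".hprof"
     * @param liveOnly if true, only objects that are still reachable are dumped
     */
    static void dump(Path file, boolean liveOnly) throws IOException {
        // Assumption: a HotSpot-based JVM, where this platform MXBean is available.
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toAbsolutePath().toString(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Illustrative default file name; pass another path as the first argument.
        Path target = Paths.get(args.length > 0 ? args[0] : "taskmanager.hprof");
        dump(target, true);
        System.out.println("Heap dump written to " + target.toAbsolutePath());
    }
}
```

If attaching from outside the process is easier, the JDK's jmap tool can 
produce the same kind of file, e.g. jmap -dump:live,format=b,file=tm.hprof 
followed by the PID of the TM.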

Best,
Stefan

> On 29.11.2016 at 18:39, Konstantin Knauf 
> <konstantin.kn...@tngtech.com> wrote:
> 
> Hi everyone, 
> 
> since upgrading to Flink 1.1.3 we have been observing frequent TaskManager 
> failures caused by PermGen OOMEs. Monitoring the PermGen size on one of the 
> TaskManagers shows that each job (new jobs as well as restarts) adds a few 
> MB that cannot be collected. Eventually, the OOME happens. This happens with 
> all our jobs, streaming and batch, on YARN 2.4 as well as standalone. 
> 
> On Flink 1.0.2 this was not a problem, but I will investigate it further.
> 
> Our assumption is that Flink is somehow holding on to one of the classes that 
> come with our jar, and thereby prevents the GC of the whole class loader. Our 
> jars do not include any Flink dependencies (they are compileOnly), but of 
> course they include many others.
> 
> Any ideas anyone? 
> 
> Cheers and thank you, 
> 
> Konstantin 
> 
> sent from my phone. Plz excuse brevity and tpyos.
> ---
> Konstantin Knauf *konstantin.kn...@tngtech.com * +49-174-3413182
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
