Hi Arnaud!

Java direct memory is tricky to debug. You can turn on the memory logging
or check the TaskManager tab in the web dashboard; both report direct
memory consumption.
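
If you want to sample the same kind of number from inside your own code, here is a minimal sketch that uses only the standard JDK management API (it is not Flink's memory logger). The class and method names (DirectMemoryProbe, directMemoryUsed) are made up for illustration. Note that the "direct" pool only accounts for NIO direct buffers, so native memory held by other libraries will not show up in it.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper: reads the JVM's own accounting of NIO direct buffers.
public class DirectMemoryProbe {

    /** Returns the bytes used by the JVM's "direct" buffer pool, or -1 if the pool is not found. */
    public static long directMemoryUsed() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        // Allocate 16 MiB off-heap so the pool usage visibly changes.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directMemoryUsed();
        System.out.println("direct memory before: " + before + " bytes");
        System.out.println("direct memory after:  " + after + " bytes");
        System.out.println("buffer capacity:      " + buffer.capacity() + " bytes");
    }
}
```

Calling directMemoryUsed() periodically (for example from the open() and close() methods of a function) may help to narrow down which operator makes the number grow.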

One thing to look for is streams that are never closed. An unclosed
stream keeps holding its native resources until the Java object is
garbage collected, which may be quite a bit later.
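
As a rough illustration of the pattern (not taken from your job), the sketch below uses try-with-resources so that the stream is closed as soon as the block exits, even when an exception is thrown. The class name CloseStreams and the method writeLines are hypothetical, and it writes to a local file through java.nio rather than to HDFS.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example: the writer is released deterministically, not by the GC.
public class CloseStreams {

    /** Writes each line to the target file and returns the number of lines written. */
    public static long writeLines(Path target, Iterable<String> lines) throws IOException {
        long count = 0;
        // try-with-resources calls writer.close() on normal exit and on exceptions.
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
                count++;
            }
        }
        return count;
    }
}
```

For streams that live longer than one method call, such as files kept open by a sink, the equivalent is to close them explicitly in the function's close() method.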

Greetings,
Stephan


On Fri, Nov 13, 2015 at 3:59 PM, Ufuk Celebi <u...@apache.org> wrote:

>
> > On 13 Nov 2015, at 15:49, LINZ, Arnaud <al...@bouyguestelecom.fr> wrote:
> >
> > Hi Robert,
> >
> > Thanks, it works with 50% -- at least way past the previous crash point.
> >
> > In my opinion (I lack real metrics), the part that uses the most memory
> > is the M2 mapper, instantiated once per slot.
> > The most complex part is the Sink (it does use a lot of HDFS files,
> > flushing threads etc.); but I expect the “RichSinkFunction” to be
> > instantiated only once per slot? I’m really surprised by that memory
> > usage, I will try using a monitoring app on the YARN JVM to understand.
>
> In general it’s instantiated once per subtask. For your current
> deployment, it is one per slot.
>
> – Ufuk
>
>
