> On 13 Nov 2015, at 15:49, LINZ, Arnaud <al...@bouyguestelecom.fr> wrote:
> 
> Hi Robert,
>  
> Thanks, it works with 50% -- at least way past the previous crash point.
>  
> In my opinion (I lack real metrics), the part that uses the most memory is 
> the M2 mapper, instantiated once per slot.
> The most complex part is the Sink (it uses a lot of HDFS files, flushing 
> threads, etc.), but I expect the “RichSinkFunction” to be instantiated only 
> once per slot? I’m really surprised by that memory usage; I will try using a 
> monitoring app on the YARN JVM to understand it.

In general, it’s instantiated once per parallel subtask, not once per slot. 
In your current deployment, that works out to one instance per slot.

– Ufuk
