It's a DataSet program that performs simple filtering, a cross join, and an
aggregation.
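
The dataflow described above (filter, cross join, aggregate) boils down to something like this pure-Python sketch — not Flink code, and the records and field layout are hypothetical, just to make the shape of the job concrete:

```python
from itertools import product
from collections import defaultdict

# Hypothetical records standing in for parsed CSV rows: (key, value).
rows = [(1, 10.0), (2, -5.0), (1, 7.0)]

# Step 1: simple filtering -- keep only positive values.
filtered = [r for r in rows if r[1] > 0]

# Step 2: cross join -- every pair of filtered records.
# Step 3: aggregation -- sum of products, grouped by the left key.
agg = defaultdict(float)
for (left_key, left_val), (_, right_val) in product(filtered, filtered):
    agg[left_key] += left_val * right_val

print(dict(agg))  # {1: 289.0}
```

Worth noting: a cross join materializes every pairing, so its output grows with the square of the input size — on a 3.8 GB input that alone is a plausible source of memory pressure.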

I'm using the Hadoop S3 FileSystem (not the EMR one), since Flink's S3
connector doesn't work at all for me.

Currently I have 3 TaskManagers with 5,000 MB each, but I have tried
different configurations and they all lead to the same exception.
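
For reference, the kind of TaskManager memory settings being varied here live in flink-conf.yaml. A sketch with the values described above — key names as assumed for Flink 0.9/0.10, so check them against your version's defaults:

```yaml
# flink-conf.yaml -- sketch, assuming Flink 0.9/0.10 key names
taskmanager.heap.mb: 5000          # the 5,000 MB per TaskManager mentioned above
taskmanager.numberOfTaskSlots: 1
taskmanager.memory.fraction: 0.7   # share of the heap Flink pre-allocates as managed memory
```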

Sent from my ZenFone
On Oct 8, 2015 12:05 PM, "Stephan Ewen" <se...@apache.org> wrote:

> Can you give us a bit more background?  What exactly is your program
> doing?
>
>   - Are you running a DataSet program, or a DataStream program?
>   - Is it one simple source that reads from S3, or are there multiple
> sources?
>   - What operations do you apply on the CSV file?
>   - Are you using Flink's S3 connector, or the Hadoop S3 file system?
>
> Greetings,
> Stephan
>
>
> On Thu, Oct 8, 2015 at 5:58 PM, KOSTIANTYN Kudriavtsev <
> kudryavtsev.konstan...@gmail.com> wrote:
>
>> Hi guys,
>>
>> I'm running Flink on EMR with 2 m3.xlarge instances (16 GB RAM each) and
>> trying to process 3.8 GB of CSV data from S3. I'm surprised that Flink
>> failed with OutOfMemoryError: Java heap space.
>>
>> I tried to find the reason:
>> 1) I identified the TaskManager PID with: ps aux | grep TaskManager
>> 2) then built a heap histogram:
>> $ jmap -histo:live 19648 | head -n23
>>  num     #instances         #bytes  class name
>> ----------------------------------------------
>>    1:        131018     3763501304  [B
>>    2:         61022        7820352  <methodKlass>
>>    3:         61022        7688456  <constMethodKlass>
>>    4:          4971        5454408  <constantPoolKlass>
>>    5:          4966        4582232  <instanceKlassKlass>
>>    6:          4169        3003104  <constantPoolCacheKlass>
>>    7:         15696        1447168  [C
>>    8:          1291         638824  [Ljava.lang.Object;
>>    9:          5318         506000  java.lang.Class
>>
>>
>> Do you have any idea what the reason could be and how it can be fixed?
>> Does Flink use off-heap memory?
>>
>>
>> Thank you,
>> Konstantin Kudryavtsev
>>
>
>
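
Reading the histogram: `[B` is the JVM's internal name for `byte[]`, and the top entry alone accounts for essentially the whole heap. A quick back-of-the-envelope check, using only the figures copied from the jmap output above and the 5,000 MB TaskManager size mentioned earlier:

```python
# Figures taken from the jmap histogram above (class "[B", i.e. byte[]).
byte_array_instances = 131_018
byte_array_bytes = 3_763_501_304

# Average byte[] size, and share of a 5,000 MB TaskManager heap.
avg_kib = byte_array_bytes / byte_array_instances / 1024
heap_share = byte_array_bytes / (5_000 * 1024 * 1024)

print(f"avg byte[] size: {avg_kib:.1f} KiB")        # roughly 28 KiB on average
print(f"share of 5000 MB heap: {heap_share:.0%}")   # roughly 72% of the heap
```

So the heap is dominated by a relatively small number of large byte arrays, which points at buffered record data rather than a leak in small objects.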
