Can you paste the exception stack trace?

On Thu, Oct 8, 2015 at 6:15 PM, KOSTIANTYN Kudriavtsev
<kudryavtsev.konstan...@gmail.com> wrote:
> It's a DataSet program that performs simple filtering, cross join and
> aggregation.
>
> I'm using the Hadoop S3 FileSystem (not EMR's), since Flink's S3 connector
> doesn't work at all.
>
> Currently I have 3 TaskManagers with 5,000 MB each, but I tried different
> configurations and all of them lead to the same exception.
>
> Sent from my ZenFone
>
> On Oct 8, 2015 12:05 PM, "Stephan Ewen" <se...@apache.org> wrote:
>
>> Can you give us a bit more background? What exactly is your program
>> doing?
>>
>> - Are you running a DataSet program, or a DataStream program?
>> - Is it one simple source that reads from S3, or are there multiple
>>   sources?
>> - What operations do you apply on the CSV file?
>> - Are you using Flink's S3 connector, or the Hadoop S3 file system?
>>
>> Greetings,
>> Stephan
>>
>> On Thu, Oct 8, 2015 at 5:58 PM, KOSTIANTYN Kudriavtsev
>> <kudryavtsev.konstan...@gmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> I'm running Flink on EMR with 2 m3.xlarge instances (16 GB RAM each) and
>>> trying to process 3.8 GB of CSV data from S3. I'm surprised that Flink
>>> failed with OutOfMemoryError: Java heap space.
>>>
>>> I tried to find the reason:
>>> 1) identify the TaskManager process: ps aux | grep TaskManager
>>> 2) then build a heap histogram:
>>>
>>> $ jmap -histo:live 19648 | head -n23
>>>  num     #instances         #bytes  class name
>>> ----------------------------------------------
>>>    1:        131018     3763501304  [B
>>>    2:         61022        7820352  <methodKlass>
>>>    3:         61022        7688456  <constMethodKlass>
>>>    4:          4971        5454408  <constantPoolKlass>
>>>    5:          4966        4582232  <instanceKlassKlass>
>>>    6:          4169        3003104  <constantPoolCacheKlass>
>>>    7:         15696        1447168  [C
>>>    8:          1291         638824  [Ljava.lang.Object;
>>>    9:          5318         506000  java.lang.Class
>>>
>>> Do you have any ideas what the reason could be and how it can be fixed?
>>> Does Flink use off-heap memory?
>>>
>>> Thank you,
>>> Konstantin Kudryavtsev
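[Editor's note: for reference, a job of the shape described in this thread (read a CSV from S3 via the Hadoop S3 file system, filter, cross join, aggregate) looks roughly like the sketch below, using the Java DataSet API. The s3n:// path, the two-column schema, the filter predicate and the cross function are illustrative placeholders, not the original poster's code.]

import org.apache.flink.api.common.functions.CrossFunction;
import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class S3CsvJob {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read a two-column CSV through the Hadoop S3 file system
        // (placeholder bucket/path and schema).
        DataSet<Tuple2<String, Double>> records = env
                .readCsvFile("s3n://my-bucket/input.csv")
                .types(String.class, Double.class);

        // Simple filtering, as mentioned in the thread (placeholder predicate).
        DataSet<Tuple2<String, Double>> filtered = records
                .filter(new FilterFunction<Tuple2<String, Double>>() {
                    @Override
                    public boolean filter(Tuple2<String, Double> record) {
                        return record.f1 > 0.0;
                    }
                });

        // Small second input for the cross join (placeholder values).
        DataSet<Tuple2<String, Double>> weights = env.fromElements(
                new Tuple2<>("a", 1.0), new Tuple2<>("b", 2.0));

        // Cross join, then aggregate per key. cross() builds the full
        // Cartesian product of both inputs, which can be very expensive
        // when both sides are large.
        filtered.cross(weights)
                .with(new CrossFunction<Tuple2<String, Double>,
                                        Tuple2<String, Double>,
                                        Tuple2<String, Double>>() {
                    @Override
                    public Tuple2<String, Double> cross(Tuple2<String, Double> left,
                                                        Tuple2<String, Double> right) {
                        return new Tuple2<>(left.f0, left.f1 * right.f1);
                    }
                })
                .groupBy(0)
                .sum(1)
                .print();
    }
}

[The cross step is the part worth watching in a setup like this: because it pairs every record of one input with every record of the other, it is a common source of memory pressure when neither side is small.]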