These files are read when you run sqlContext.parquetFile(), and on S3 that
takes long enough that something is timing out internally.
-Manu
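
For reference, the call Manu is describing looks roughly like this in a
1.x-era spark-shell (a minimal sketch only; the s3n:// bucket, path, and
credential values are placeholders, not taken from this thread):

  // Sketch: loading a Parquet file from S3 in the Spark 1.x spark-shell,
  // where `sc` already exists. Bucket, path, and keys are placeholders.
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)

  sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "<access-key>")
  sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "<secret-key>")

  // The per-file metadata reads Manu mentions happen during this call,
  // over S3, before any query runs.
  val parquetData = sqlContext.parquetFile("s3n://some-bucket/some/path")
  parquetData.count()
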
spark://ec2-x-x-x-x.compute-1.amazonaws.com:7077

According to the master web page each node has 6 Gig, so I'm not sure why
I'm seeing that message either. If I run with less than 2g I get the
following in my spark-shell:

14/09/08 17:47:
Caused by: java.util.concurrent.ExecutionException:
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)

I'm not sure if this exception is from the spark-shell jvm or transferred
over from the master or a worker through the master.
Any help would be greatly appreciated.
Thanks
Jim
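
For anyone following along, the memory settings in question are normally
fixed when the SparkContext is created (or with the matching spark-shell
flags). A rough sketch for this standalone setup; the 2g value just mirrors
what was tried above, and the master URL is the placeholder from the message:

  // Sketch: pointing a Spark 1.x application at the standalone master above
  // and setting per-executor memory explicitly. Values are illustrative.
  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setMaster("spark://ec2-x-x-x-x.compute-1.amazonaws.com:7077")
    .setAppName("parquet-s3-query")
    .set("spark.executor.memory", "2g")  // per-executor heap on the 6 GB nodes
  val sc = new SparkContext(conf)
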