Re: OutOfMemoryError on parquet SnappyDecompressor

2016-11-21 Thread Ryan Blue
… scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
scala.collection.Iterator$class.isEmpty(Iterator.scala:256)
scala.collection.AbstractIterator.isEmpty(Iterator.scala:1157) …

Re: OutOfMemoryError on parquet SnappyDecompressor

2016-11-21 Thread Aniket
… org.apache.spark.sql.execution.ExistingRdd$$anonfun$productToRowRdd$1.apply(basicOperators.scala:220)
org.apache.spark.sql.execution.ExistingRdd$$anonfun$productToRowRdd$1.apply(basicOperators.scala:219)
org.apache.spark.rdd.RDD$$anonfun$13.app…

Re: OutOfMemoryError on parquet SnappyDecompressor

2016-11-21 Thread Ryan Blue
… scala.collection.Iterator$class.isEmpty(Iterator.scala:256)
scala.collection.AbstractIterator.isEmpty(Iterator.scala:1157)
org.apache.spark.sql.execution.ExistingRdd$$an…

Re: OutOfMemoryError on parquet SnappyDecompressor

2016-11-20 Thread Aniket
Was anyone able to find a solution or a recommended configuration for this? I am running into the same "java.lang.OutOfMemoryError: Direct buffer memory", but during Snappy compression. Thanks, Aniket. On Tue, Sep 23, 2014 at 7:04 PM Aaron Davidson [via Apache Spark Developers List] wrote: > This may be related: …
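
For readers hitting the same error: because the Snappy codec allocates direct ByteBuffers outside the JVM heap, raising the executors' heap alone does not help. A minimal sketch of raising the off-heap ceiling instead, assuming submission through a SparkConf (the 2g value is an illustrative assumption, not a tested recommendation):

    import org.apache.spark.{SparkConf, SparkContext}

    // Raise the JVM's direct-memory limit on the executors, since Snappy's
    // compression buffers live off-heap. 2g is an illustrative guess.
    val conf = new SparkConf()
      .setAppName("parquet-snappy-direct-memory")
      .set("spark.executor.extraJavaOptions", "-XX:MaxDirectMemorySize=2g")
    val sc = new SparkContext(conf)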

Re: OutOfMemoryError on parquet SnappyDecompressor

2014-09-23 Thread Aaron Davidson
This may be related: https://github.com/Parquet/parquet-mr/issues/211 Perhaps if we change our Parquet configuration settings it would get better, but the performance characteristics of Snappy are pretty bad here under some circumstances. On Tue, Sep 23, 2014 at 10:13 AM, Cody Koeninger wrote: …
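
For anyone who wants to try the Parquet-side tuning Aaron suggests, a hedged sketch against the Spark 1.x SQLContext API, assuming an existing SparkContext sc (the gzip fallback and the buffer sizes are illustrative assumptions, not values from this thread):

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)

    // Option 1: sidestep Snappy entirely by writing gzip-compressed Parquet.
    sqlContext.setConf("spark.sql.parquet.compression.codec", "gzip")

    // Option 2: keep Snappy but shrink Parquet's buffers so each open column
    // chunk holds less data at once (values in bytes, purely illustrative).
    sc.hadoopConfiguration.setInt("parquet.block.size", 64 * 1024 * 1024)
    sc.hadoopConfiguration.setInt("parquet.page.size", 512 * 1024)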

Re: OutOfMemoryError on parquet SnappyDecompressor

2014-09-23 Thread Cody Koeninger
Cool, that's pretty much what I was thinking as far as configuration goes. Running on Mesos. Worker nodes are Amazon xlarge, so 4 cores / 15 GB. I've tried executor memory sizes as high as 6 GB. Default HDFS block size is 64 MB, with about 25 GB of total data written by a job with 128 partitions. The exception c…
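
One way to reason about these numbers: 25 GB over 128 partitions is roughly 200 MB of output per task, and every Parquet writer that is open concurrently buffers up to a block's worth of data, with Snappy's direct buffers on top. A hedged sketch of reducing the number of simultaneously open writers by coalescing before the write (schemaRdd is a stand-in for the Spark 1.x SchemaRDD being written; the target of 32 is an illustrative assumption):

    // Fewer output partitions means fewer Parquet writers, and therefore
    // fewer Snappy direct buffers, open at once per executor.
    val narrowed = schemaRdd.coalesce(32)
    narrowed.saveAsParquetFile("hdfs:///tmp/out.parquet")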

Re: OutOfMemoryError on parquet SnappyDecompressor

2014-09-23 Thread Michael Armbrust
I actually submitted a patch to do this yesterday: https://github.com/apache/spark/pull/2493 Can you tell us more about your configuration? In particular, how much memory and how many cores do the executors have, and what does the schema of your data look like? On Tue, Sep 23, 2014 at 7:39 AM, Cody Koeninger …
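
For anyone else chiming in with their setup, a quick hedged sketch of pulling the details Michael asks for (Spark 1.x API; schemaRdd is a hypothetical handle to the data being written):

    // Report the executor sizing in effect and the schema Parquet will see.
    println(sc.getConf.get("spark.executor.memory", "<default>"))
    schemaRdd.printSchema()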

Re: OutOfMemoryError on parquet SnappyDecompressor

2014-09-23 Thread Cody Koeninger
So as a related question, is there any reason the settings in SQLConf aren't read from the Spark context's conf? I understand why the SQL conf is mutable, but it's not particularly user-friendly to have most Spark configuration set via e.g. defaults.conf or --properties-file, but for Spark SQL to…
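
To make the point concrete, a minimal sketch of the two separate paths as they stood at the time of this thread (Spark 1.x; the property values shown are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Core Spark settings are picked up from spark-defaults.conf or
    // --properties-file, e.g. spark.executor.memory=6g.
    val sc = new SparkContext(new SparkConf())

    // Spark SQL settings live in the mutable SQLConf and had to be set
    // programmatically per SQLContext; the patch Michael mentions above
    // (https://github.com/apache/spark/pull/2493) targets exactly this gap.
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    sqlContext.setConf("spark.sql.parquet.compression.codec", "snappy")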