Hi
I'm trying to run some Spark code on a cluster but I keep running into a
"java.io.StreamCorruptedException: invalid type code: AC" error. My task
involves analyzing ~50GB of data (some operations involve sorting) and then
writing it out to a JSON file. I'm running the analysis on each of the
data's ~1
I haven't been able to set the cores with that option in Spark 1.0.0 either.
To work around that, setting the environment variable
SPARK_JAVA_OPTS="-Dspark.cores.max=" seems to do the trick.
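For reference, a minimal sketch of the workaround above as it would go in conf/spark-env.sh or a worker's environment (the value 4 is purely illustrative; the original message leaves it blank):

```shell
# Cap the total cores the application may claim by passing the
# spark.cores.max property as a JVM system option (Spark 1.0.x workaround).
# The core count here (4) is an illustrative placeholder.
export SPARK_JAVA_OPTS="-Dspark.cores.max=4"
```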
Matt Kielo
Data Scientist
Oculus Info Inc.
On Tue, Jun 3, 2014 at 11:15 AM, Marek Wiewiorka
wrote:
Hello,
I currently have a task always failing with "java.io.FileNotFoundException:
[...]/shuffle_0_257_2155 (Too many open files)" when I run sorting
operations such as distinct, sortByKey, or reduceByKey on a large number of
partitions.
I'm working with 365 GB of data which is being split into 59
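Two mitigations commonly suggested for this "Too many open files" shuffle failure are raising the OS file-descriptor limit and enabling shuffle file consolidation; a sketch, assuming a Spark 1.x standalone setup (the limit value 16384 is illustrative):

```shell
# 1. Raise the per-process open-file limit before launching workers, e.g.:
#      ulimit -n 16384      # illustrative value; check your hard limit first
# 2. Have Spark consolidate intermediate shuffle files so each reduce task
#    opens fewer of them (spark.shuffle.consolidateFiles is a Spark 1.x
#    setting, passed here the same way as the cores workaround above):
export SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.shuffle.consolidateFiles=true"
```

Reducing the number of partitions (and thus shuffle files) is another option if the data layout allows it.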