While submitting the job, you can use the --jars and --driver-class-path
options to add the jars. Apart from that, if you are running the job as a
standalone application, you can use sc.addJar to add the jar (which will
ship it to all the executors).
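
For example, a rough sketch (the app name and jar path here are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("MyApp")
val sc = new SparkContext(conf)

// Ship an extra jar to the driver and every executor at runtime.
sc.addJar("/path/to/extra-dependency.jar")

On the command line, the equivalent is passing
spark-submit --jars /path/to/extra-dependency.jar when you launch the job.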
Regards,
Anish
Hi,
If you are decreasing the number of partitions in this RDD, consider
using coalesce, which can avoid performing a shuffle.
However, if you're doing a drastic coalesce, e.g. to numPartitions =
1, this may result in your computation taking place on fewer nodes
than you like (e.g. one node in the case of numPartitions = 1). To avoid
this, you can pass shuffle = true to coalesce.
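
Roughly, the difference looks like this (the partition counts are just
examples):

val rdd = sc.parallelize(1 to 1000000, 100)  // 100 initial partitions

// coalesce merges existing partitions without a shuffle (narrow dependency).
val merged = rdd.coalesce(10)

// repartition always shuffles and redistributes the data evenly;
// it is equivalent to coalesce(10, shuffle = true).
val reshuffled = rdd.repartition(10)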
Hi, I had the same problem.
One option (starting with Spark 1.2, which is currently in preview) is to
use the Avro library for Spark SQL.
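
If it helps, a rough sketch using the Databricks spark-avro package (the
avroFile helper and the file path are assumptions based on that library's
README; double-check against the version you are using):

import org.apache.spark.sql.SQLContext
import com.databricks.spark.avro._

val sqlContext = new SQLContext(sc)

// Load an Avro file through Spark SQL and query it (path is a placeholder).
val episodes = sqlContext.avroFile("hdfs:///data/episodes.avro")
episodes.registerTempTable("episodes")
sqlContext.sql("SELECT * FROM episodes LIMIT 10").collect()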
The other is using Kryo serialization. By default Spark uses Java
serialization; you can specify Kryo serialization while creating the
Spark context:
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
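
A fuller sketch (MyRecord is a placeholder for your own class;
registerKryoClasses is available starting with Spark 1.2):

import org.apache.spark.{SparkConf, SparkContext}

case class MyRecord(id: Long, name: String)  // placeholder class

val conf = new SparkConf()
  .setAppName("KryoExample")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Registering classes up front lets Kryo serialize them more compactly.
  .registerKryoClasses(Array(classOf[MyRecord]))
val sc = new SparkContext(conf)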