If you are using textFile() to read data in, it also takes a parameter for the
minimum number of partitions to create. Would that not work for you?
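For illustration, a minimal sketch in Java; the input path and partition count are placeholders:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class MinPartitionsExample {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext(
                    new SparkConf().setAppName("min-partitions-example"));

            // The second argument is the minimum number of partitions to
            // create; Spark may create more, but not fewer.
            JavaRDD<String> lines = sc.textFile("hdfs:///data/input.txt", 64);
            System.out.println("partitions: " + lines.partitions().size());

            sc.stop();
        }
    }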
On Oct 2, 2014 7:00 AM, "jamborta" wrote:
> Hi all,
>
> I have been testing repartitioning to ensure that my algorithms get a
> similar amount of data.
Hello Mark,
I am no expert but I can answer some of your questions.
On Oct 2, 2014 2:15 AM, "Mark Mandel" wrote:
>
> Hi,
>
> So I'm super confused about how to take my Spark code and actually deploy
and run it on a cluster.
>
> Let's assume I'm writing in Java, and we'll take a simple example su
Hello Sanjay,
This can be done, and is a very effective way to debug.
1) Compile and package your project to get a fat jar
2) In your SparkConf, use setJars and give the location of this jar. Also set
your master to local in the SparkConf (see the sketch after these steps)
3) Use this SparkConf when creating JavaSparkContext
4) Debug
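A minimal sketch of steps 2 and 3 in Java; the jar path and app name are placeholders:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class LocalDebugExample {
        public static void main(String[] args) {
            // Step 2: point setJars at the fat jar from step 1
            // and use a local master.
            SparkConf conf = new SparkConf()
                    .setAppName("local-debug")
                    .setMaster("local")
                    .setJars(new String[] {"target/myjob-assembly.jar"});

            // Step 3: create the context from this configuration.
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Step 4: set breakpoints in your job logic and attach
            // your IDE's debugger to this JVM.

            sc.stop();
        }
    }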
Hello,
I have written a standalone Spark job which I run through the Ooyala Job
Server. The program is working correctly; now I'm looking into how to
optimize it.
My program without optimization took 4 hours to run. The first optimizations
were using the KryoSerializer and compiling regex patterns and reusing them.
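For illustration, a sketch of those two changes in Java; the config key and serializer class are the standard Spark ones, while the pattern and app name are placeholders:

    import java.util.regex.Pattern;
    import org.apache.spark.SparkConf;

    public class OptimizedJobConfig {
        // Compile the pattern once and reuse it for every record,
        // instead of calling Pattern.compile() per record.
        static final Pattern FIELD_DELIM = Pattern.compile("\\t");

        public static SparkConf build() {
            return new SparkConf()
                    .setAppName("optimized-job")
                    // Switch from Java serialization to Kryo.
                    .set("spark.serializer",
                         "org.apache.spark.serializer.KryoSerializer");
        }
    }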
I solved this issue by putting the hbase-protocol jar on the Hadoop classpath,
and not on the Spark classpath.
export HADOOP_CLASSPATH="/path/to/jar/hbase-protocol-0.98.1-cdh5.1.0.jar"
On Tue, Aug 26, 2014 at 5:42 PM, Ashish Jain wrote:
> Hello,
>
> I'm using the following
Hello,
I'm using the following version of Spark - 1.0.0+cdh5.1.0+41
(1.cdh5.1.0.p0.27).
I've tried to specify the libraries Spark uses in the following ways (a sketch
of the first two follows the list):
1) Adding them to the Spark context
2) Specifying the jar path in
a) spark.executor.extraClassPath
b) spark.executor.extraLibraryPath
3)
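For reference, a sketch of approaches 1) and 2) set programmatically in Java; all paths here are placeholders:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class ExtraClassPathExample {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("extra-classpath-example")
                    // 2a) prepend a jar to each executor's classpath
                    .set("spark.executor.extraClassPath",
                         "/opt/libs/my-dep.jar")
                    // 2b) add a native library directory for executors
                    .set("spark.executor.extraLibraryPath", "/opt/native");

            JavaSparkContext sc = new JavaSparkContext(conf);
            // 1) ship a jar with the job via the Spark context
            sc.addJar("/opt/libs/my-dep.jar");
            sc.stop();
        }
    }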