Re: Submitting Spark Applications using Spark Submit

2015-06-20 Thread Raghav Shankar
…build it locally on my laptop and scp the assembly jar to the cluster instead of building it there. The EC2 machines often take much longer to build for some reason. Also it's cumbersome to set up a proper IDE there. -Andrew

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Raghav Shankar
Thanks Andrew! Is this all I have to do when using the spark-ec2 script to set up a Spark cluster? It seems to be getting an assembly jar that is not from my project (perhaps from a Maven repo). Is there a way to make the ec2 script use the assembly jar that I created? Thanks, Raghav

Re: Implementing top() using treeReduce()

2015-06-17 Thread Raghav Shankar
Sincerely, DB Tsai (Blog: https://www.dbtsai.com, PGP Key ID: 0xAF08DF8D). [Quoting Raghav Shankar:] I've implemented this in the suggested manner. When I build…

Re: Implementing top() using treeReduce()

2015-06-17 Thread Raghav Shankar
I've implemented this in the suggested manner. When I build Spark and attach the new spark-core jar to my Eclipse project, I am able to use the new method. In order to conduct the experiments I need to launch my app on a cluster. I am using EC2. When I set up my master and slaves using the EC2 se…

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
…will upload this jar to the YARN cluster automatically, and then you can run your application as usual. It does not care which version of Spark is in your YARN cluster. [Quoting Raghav Shankar:] The documentation says spark.driver.userClassPath…

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
https://spark.apache.org/docs/1.4.0/configuration.html [Quoting Raghav Shankar:] I made the change so that I could implement top() using treeReduce(). A member on here suggested I make the change in RDD.scala to accomplish that. Also,…

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
I made the change so that I could implement top() using treeReduce(). A member on here suggested I make the change in RDD.scala to accomplish that. Also, this is for a research project, and not for commercial use. So, any advice on how I can get spark-submit to use my custom-built jars would…
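The idea discussed in this thread (implementing top() on top of a tree-shaped reduce) can be illustrated with a minimal, Spark-free sketch. This is not Spark's actual RDD.scala code; `tree_top_k` and `merge_top_k` are hypothetical names, and plain Python lists stand in for partitions. Each "partition" first keeps only its own k largest elements, and those candidate lists are then merged pairwise, level by level, as treeReduce would.

```python
import heapq

def merge_top_k(a, b, k):
    """Combine two descending top-k lists into the overall top-k."""
    return heapq.nlargest(k, a + b)

def tree_top_k(partitions, k):
    """Simulate top(k) via a tree-shaped reduce over per-partition top-k lists."""
    # Map side: each partition contributes only its own k largest elements.
    level = [heapq.nlargest(k, p) for p in partitions if p]
    # Tree reduce: merge candidate lists pairwise until one list remains, so
    # no single reducer ever sees more than two k-element lists at a time.
    while len(level) > 1:
        level = [merge_top_k(level[i], level[i + 1], k) if i + 1 < len(level)
                 else level[i]
                 for i in range(0, len(level), 2)]
    return level[0] if level else []

print(tree_top_k([[5, 1, 9], [7, 3], [8, 2, 6], [4]], 3))  # [9, 8, 7]
```

The payoff over a plain reduce on the driver is that the driver receives a single k-element list instead of one k-element list per partition.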

Re: Different Sorting RDD methods in Apache Spark

2015-06-09 Thread Raghav Shankar
Thank you for your responses! You mention that it only works as long as the data fits on a single machine. What I am trying to do is receive the sorted contents of my dataset. For this to be possible, the entire dataset should be able to fit on a single machine. Are you saying that sorting the entire…
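The distinction being discussed here can be shown with a small sketch of how a sortBy-style distributed sort works: records are range-partitioned so the partitions themselves are in global order, and each partition is sorted locally. The sorted result stays distributed; only a collect() requires everything to fit on one machine. This is illustrative only (`distributed_sort` is an invented name, and real Spark samples the data to choose balanced range boundaries rather than taking them as a parameter).

```python
import bisect

def distributed_sort(records, boundaries):
    """Range-partition records, then sort each partition locally.

    With boundaries [b0, b1, ...], partition 0 holds values < b0,
    partition 1 holds values in [b0, b1), and so on, so concatenating
    the sorted partitions yields a globally sorted sequence.
    """
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for r in records:
        # Route each record to the partition that owns its key range.
        partitions[bisect.bisect_left(boundaries, r)].append(r)
    # Each worker sorts its own partition independently.
    return [sorted(p) for p in partitions]

parts = distributed_sort([9, 1, 7, 3, 8, 2], boundaries=[4, 8])
# A collect() concatenates the partitions, pulling them to one machine.
flat = [x for p in parts for x in p]
print(parts, flat)
```

So sorting itself scales out; it is only materializing the full sorted result on the driver that is bounded by a single machine's memory.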

Re: TreeReduce Functionality in Spark

2015-06-04 Thread Raghav Shankar
…Sincerely, DB Tsai (Blog: https://www.dbtsai.com). [Quoting Raghav Shankar:] Hey Reza, thanks for your response! Your response clarifies some of my initial…

Re: TreeReduce Functionality in Spark

2015-06-04 Thread Raghav Shankar
Hey Reza, Thanks for your response! Your response clarifies some of my initial thoughts. However, what I don't understand is how the depth of the tree is used to identify how many intermediate reducers there will be, and how many partitions are sent to the intermediate reducers. Could you provide
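The question above (how depth maps to the number of intermediate reducers) can be sketched numerically. The sketch below mirrors the idea in Spark's treeAggregate: a scale factor is derived from the partition count and the requested depth, and each round merges about `scale` partitions into one intermediate reducer. This is a simplification, not Spark's exact code (the real implementation also stops shrinking early when it no longer pays off), and `tree_reduce_fan_in` is an invented name.

```python
import math

def tree_reduce_fan_in(num_partitions, depth):
    """Return (scale, partition counts after each round of reduction).

    scale is chosen so that roughly `depth` rounds of merging `scale`
    partitions at a time collapse num_partitions down to 1.
    """
    scale = max(int(math.ceil(num_partitions ** (1.0 / depth))), 2)
    levels = []
    while num_partitions > 1:
        # Each round: about num_partitions / scale intermediate reducers,
        # each merging the partial results of ~scale partitions.
        num_partitions = int(math.ceil(num_partitions / scale))
        levels.append(num_partitions)
    return scale, levels

# e.g. 64 partitions at depth 2: scale 8, so 64 -> 8 reducers -> 1 result
print(tree_reduce_fan_in(64, 2))  # (8, [8, 1])
```

So a larger depth gives a smaller scale factor: more rounds, but each intermediate reducer receives fewer partial results.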

Re: Task result in Spark Worker Node

2015-04-17 Thread Raghav Shankar
…scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:217) [Quoting Raghav Shankar:] Hey Imran, thanks for the great explanation! This cleared up a lot of things for me. I am actually trying to utilize some of the features withi…

Re: Task result in Spark Worker Node

2015-04-17 Thread Raghav Shankar
Hey Imran, thanks for the great explanation! This cleared up a lot of things for me. I am actually trying to utilize some of the features within Spark for a system I am developing. I am currently working on a subsystem that can be integrated within Spark and other Big Data solutions…

Re: Sending RDD object over the network

2015-04-06 Thread Raghav Shankar
Hey Akhil, thanks for your response! No, I am not expecting to receive the values themselves. I am just trying to receive the RDD object in my second Spark application. However, I get an NPE when I try to use the object within my second program. Would you know how I can properly send the RDD object…
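The NPE described above is consistent with what an RDD is: not a container of values but a lineage description plus a live reference to the SparkContext that created it, and that context reference does not survive being shipped to another application. The following is a Python analogy of that failure mode, not Spark code; `FakeRDD` and `FakeContext` are invented names for illustration.

```python
import pickle

class FakeContext:
    """Stands in for a live SparkContext: only it can actually run a plan."""
    def run(self, plan):
        return f"executed {plan}"

class FakeRDD:
    """Analogy for an RDD: a plan (lineage) plus a reference to its context."""
    def __init__(self, ctx, plan):
        self.ctx = ctx
        self.plan = plan

    def __getstate__(self):
        # Like Spark, the live context is not shipped when the RDD object
        # is serialized; the receiver gets only the inert description.
        return {"plan": self.plan, "ctx": None}

    def collect(self):
        return self.ctx.run(self.plan)  # fails if the context is gone

rdd = FakeRDD(FakeContext(), "map->filter")
copy = pickle.loads(pickle.dumps(rdd))  # what the second application receives
try:
    copy.collect()
except AttributeError as e:
    print("NPE analog:", e)  # the deserialized RDD has no usable context
```

The usual remedies are to exchange the materialized data instead (e.g. write to shared storage and re-read it), rather than the RDD handle itself.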