Re: Spark 1.0.0 rc3
I built with SPARK_HADOOP_VERSION=2.3.0 sbt/sbt assembly and copied the generated jar to the lib/ directory of my application, but it seems that sbt cannot find the dependencies in the jar? Everything works with the pre-built jar files downloaded from the link provided by Patrick.

Best,
-- Nan Zhu

On Thursday, May 1, 2014 at 11:16 PM, Madhu wrote:
> I'm guessing EC2 support is not there yet?
>
> I was able to build using the binary download on both Windows 7 and RHEL 6 without issues.
> I tried to create an EC2 cluster, but saw this:
>
> ~/spark-ec2
> Initializing spark
> ~ ~/spark-ec2
> ERROR: Unknown Spark version
> Initializing shark
> ~ ~/spark-ec2 ~/spark-ec2
> ERROR: Unknown Shark version
>
> The spark dir on the EC2 master has only a conf dir, so it didn't deploy properly.
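For context, sbt treats jars dropped into a project's lib/ directory as unmanaged dependencies and adds them to the compile classpath automatically. A minimal sketch of that layout, assuming the Hadoop 2.3.0 assembly built above (the project name, Scala version, and jar name below are illustrative, not taken from this thread):

    my-app/
      lib/spark-assembly-1.0.0-hadoop2.3.0.jar    (copied from the assembly build output)
      src/main/scala/MyApp.scala
      build.sbt

    // build.sbt -- sketch only; no Spark entry in libraryDependencies is needed,
    // because the assembly jar under lib/ is already on the unmanaged classpath
    name := "My App"

    version := "1.0"

    scalaVersion := "2.10.4"

With this layout, sbt package should compile against the classes in the assembly without any extra resolvers; whether that matches the setup that failed above is hard to say from the message alone.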
Apache Spark running out of the spark shell
Hi,

I have written code that works just about fine in the Spark shell on EC2. The ec2 script helped me configure my master and worker nodes. Now I want to run the Scala Spark code outside the interactive shell. How do I go about doing it?

I was referring to the instructions mentioned here: https://spark.apache.org/docs/0.9.1/quick-start.html

But this is confusing because it mentions a simple project jar file which I am not sure how to generate. I only have the file that runs directly in my Spark shell. Are there any easy instructions to get this quickly running as a job?

Thanks
AJ
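For reference, the "simple project jar" in the quick-start guide is just the output of sbt package on a small sbt project. A rough sketch of such an application, modeled on the 0.9.1 quick start (the input file path and master URL are placeholders, not anything from this thread):

    // SimpleApp.scala -- counts lines containing 'a' and 'b' in a text file
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._

    object SimpleApp {
      def main(args: Array[String]) {
        val logFile = "/root/spark/README.md"            // any text file reachable by the job
        val sc = new SparkContext("local", "Simple App") // use the EC2 master URL for the cluster
        val logData = sc.textFile(logFile, 2).cache()
        val numAs = logData.filter(line => line.contains("a")).count()
        val numBs = logData.filter(line => line.contains("b")).count()
        println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
      }
    }

Running sbt package from the project root then produces a jar under target/scala-2.10/, and that jar is what the guide means by the simple project jar.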
Re: Apache Spark running out of the spark shell
Hi AJ,

You might find this helpful: http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/

-Sandy
Re: Apache Spark running out of the spark shell
Hey AJ,

I created a little sample app using Spark's quick start. Have a look here. Assuming you used Scala, sbt is a good way to run your application in standalone mode. The configuration file, "simple.sbt" in my repo, holds all the dependencies needed to build your app.

Hope this helps!

Nicolas Garneau
ngarn...@ngarneau.com
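For anyone following along, a simple.sbt along those lines is only a few lines. This is a sketch modeled on the 0.9.1 quick start rather than the exact file in the repo, so match the Spark and Scala versions to your cluster:

    // simple.sbt -- declares the Spark dependency sbt needs to build the app
    name := "Simple Project"

    version := "1.0"

    scalaVersion := "2.10.3"

    libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.1"

    resolvers += "Akka Repository" at "http://repo.akka.io/releases/"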
Re: Apache Spark running out of the spark shell
Thank you for the reply. Have you posted a link where I can follow the steps?
Re: Apache Spark running out of the spark shell
Sorry, the link was wrong. I meant here: https://github.com/ngarneau/spark-standalone

Nicolas Garneau
418.569.3097
ngarn...@ngarneau.com
Re: Apache Spark running out of the spark shell
Thank you. Let me try this quickly!
Re: Apache Spark running out of the spark shell
Quick question: where should I place your folder? Inside the Spark directory? My Spark directory is /root/spark.

So currently I tried pulling your GitHub code into /root/spark/spark-examples and modified my home Spark directory in the Scala code. I copied the sbt folder into the spark-examples folder. But when I try running this command:

$root/spark/spark-examples: sbt/sbt package

awk: cmd. line:1: fatal: cannot open file `./project/build.properties' for reading (No such file or directory)
Launching sbt from sbt/sbt-launch-.jar
Error: Invalid or corrupt jarfile sbt/sbt-launch-.jar

However, sbt package runs fine (expectedly) when I run it from the /root/spark folder.

Anything I am doing wrong here?
Re: Apache Spark running out of the spark shell
Hey AJ,

As I can see, the path you are running sbt from is:

> $root/spark/spark-examples: sbt/sbt package

You should be inside the app's folder, the one that contains simple.sbt, which is spark-standalone/:

> $root/spark/spark-examples/spark-standalone: sbt/sbt package
> $root/spark/spark-examples/spark-standalone: sbt/sbt run

Don't forget to move the sbt folder into your app's directory. That being said, I think you can install sbt globally on your system so you'll be able to run the sbt command anywhere on your machine. It'll be useful when creating multiple apps.

For example, this is the way I build it from A to Z:

$ git clone https://github.com/ngarneau/spark-standalone.git
$ cd spark-standalone
-- change the path of Spark's home dir
$ sbt package (assuming sbt is installed globally)
$ sbt run (assuming sbt is installed globally)

Hope this helps!

Nicolas Garneau
ngarn...@ngarneau.com
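Incidentally, the "path of Spark's home dir" change mentioned above usually lives in the SparkContext constructor of the app. A hedged sketch of what that looks like when pointing at the EC2 cluster; the master hostname, Spark home, and jar name are placeholders:

    // Sketch only; substitute your own master URL, Spark home, and packaged jar
    import org.apache.spark.SparkContext

    val sc = new SparkContext(
      "spark://<ec2-master-hostname>:7077",                    // master started by spark-ec2
      "Simple App",
      "/root/spark",                                           // Spark home on the cluster nodes
      List("target/scala-2.10/simple-project_2.10-1.0.jar"))   // jar produced by sbt package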
Re: Mailing list
Hi Nicolas,

Good catches on these things.

> Your website seems a little bit incomplete. I have found this page [1] which lists the two main mailing lists, users and dev. But I see a reference to a mailing list about "issues" which tracked the Spark issues when they were hosted at Atlassian. I guess it has moved? Where?
> And is there any mailing list about the commits?

Good catch, this was an old link and I've fixed it now. I also added the one for commits.

> Also, I found it weird that there is no page referencing the true source code, the git at the ASF; I only found references to the git at GitHub.

The GitHub repo is actually a mirror managed by the ASF, but the "git tag" link at http://spark.apache.org/downloads.html also points to the source repo. The problem is that our contribution process is through GitHub, so it's easier to point people to something that they can use to contribute.

> I am also interested in your workflow, because Ant is moving from svn to git and we're still a little bit in the grey about the workflow. I am thus intrigued how you work with GitHub pull requests.

Take a look at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark and https://cwiki.apache.org/confluence/display/SPARK/Reviewing+and+Merging+Patches to see our contribution process. In a nutshell, it works as follows:

- Anyone can make a patch by forking the GitHub repo and sending a pull request (GitHub's internal patch mechanism)
- Committers review the patch and ask for changes; contributors can push additional changes into their pull request to respond
- When the patch looks good, we use a script to merge it into the source Apache repo; this also squashes the changes into one commit, making the Git history sane and facilitating reverts, cherry-picks into other branches, etc.

Note, by the way, that using GitHub is not at all necessary for using Git. We happened to do our development on GitHub before moving to the ASF, and all our developers were used to its interface, so we stuck with it. It definitely beats attaching patches on JIRA, but it may not be the first step you want to take in moving to Git.

Matei

> Nicolas
>
> [1] https://spark.apache.org/community.html
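For readers curious what that merge step looks like mechanically, it boils down to a squash merge of the pull-request branch into the ASF repository. The commands below are a generic illustration only, not Spark's actual merge script; the remote names and PR number are made up:

$ git fetch github pull/1234/head:pr-1234     # "github" remote and PR 1234 are illustrative
$ git checkout master
$ git merge --squash pr-1234                  # squash the PR into a single change
$ git commit                                  # one commit, easy to revert or cherry-pick
$ git push apache master                      # push to the canonical ASF repository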