Sounds like there are two questions here. First, from the command line: if you run "mvn package" and then run the code with "java -cp target/*jar-with-dependencies.jar com.ibm.App", do you still get the error?

Second, for quick debugging, I agree that it's a pain to wait for "mvn package" to finish every time a line of code changes. To avoid this when working on a new (buggy) file, you can add your working -jar-with-dependencies.jar to the spark-shell classpath using the ADD_JARS environment variable, then, after making a few changes to the buggy file, use ":load" from spark-shell. This lets you try out the new class without waiting for the whole "mvn package".
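As a rough sketch of that loop (the jar name and file path below are placeholders; substitute whatever your build actually produces):

    $ mvn package    # one full build to produce the assembly jar
    $ ADD_JARS=target/myapp-1.0-jar-with-dependencies.jar spark-shell
    scala> :load src/main/scala/App.scala   # re-interpret just the edited file
    scala> App.main(Array())                # exercise the reloaded class

After each further edit you repeat only the :load step, not the whole mvn package.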
On Sat, Jun 7, 2014 at 3:19 AM, Gerard Maas <gerard.m...@gmail.com> wrote:
> I think that you have two options:
>
> - To run your code locally, use local mode by setting the 'local' master,
> like so:
>
>     new SparkConf().setMaster("local[4]")
>
> where 4 is the number of cores assigned to local mode.
>
> - To run your code remotely, you need to build the jar with dependencies
> and add it to your context:
>
>     new SparkConf().setMaster("spark://uri")
>       .setJars(Seq("/path/to/target/jar-with-dependencies.jar"))
>
> You will need to run Maven before running your program to ensure the
> latest version of your jar is built.
>
> -regards, Gerard.
>
>
> On Sat, Jun 7, 2014 at 3:10 AM, Wei Tan <w...@us.ibm.com> wrote:
>>
>> Hi,
>>
>> I am trying to write and debug Spark applications in scala-ide and
>> maven, and in my code I target a Spark instance at spark://xxx:
>>
>>     object App {
>>
>>       def main(args: Array[String]) {
>>         println("Hello World!")
>>         val sparkConf = new SparkConf()
>>           .setMaster("spark://xxx:7077")
>>           .setAppName("WordCount")
>>
>>         val spark = new SparkContext(sparkConf)
>>         val file = spark.textFile("hdfs://xxx:9000/wcinput/pg1184.txt")
>>         val counts = file.flatMap(line => line.split(" "))
>>                          .map(word => (word, 1))
>>                          .reduceByKey(_ + _)
>>         counts.saveAsTextFile("hdfs://flex05.watson.ibm.com:9000/wcoutput")
>>       }
>>
>>     }
>>
>> I added spark-core and hadoop-client as Maven dependencies, so the code
>> compiles fine.
>>
>> When I click Run in Eclipse, I get this error:
>>
>>     14/06/06 20:52:18 WARN scheduler.TaskSetManager: Loss was due to
>>     java.lang.ClassNotFoundException
>>     java.lang.ClassNotFoundException: samples.App$$anonfun$2
>>
>> I googled this error and it seems that I need to package my code into a
>> jar file and push it to the Spark nodes. But since I am debugging the
>> code, it would be handy if I could quickly see results without packaging
>> and uploading jars.
>>
>> What is the best practice for writing a Spark application (like
>> wordcount) and debugging it quickly against a remote Spark instance?
>>
>> Thanks!
>> Wei
>>
>> ---------------------------------
>> Wei Tan, PhD
>> Research Staff Member
>> IBM T. J. Watson Research Center
>> http://researcher.ibm.com/person/us-wtan
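P.S. Putting Gerard's two options into the wordcount code above, roughly (a sketch: the master URL, jar path, and HDFS paths are placeholders carried over from the thread; note the SparkConf method for shipping jars is setJars):

    import org.apache.spark.{SparkConf, SparkContext}

    object App {
      def main(args: Array[String]) {
        // Option 1: local mode -- runs in-process, nothing to package or
        // upload, so it is the quickest way to debug from Eclipse.
        val conf = new SparkConf().setMaster("local[4]").setAppName("WordCount")

        // Option 2: remote master -- ship the assembly jar so the executors
        // can load your classes (this is what fixes the ClassNotFoundException).
        // val conf = new SparkConf()
        //   .setMaster("spark://xxx:7077")
        //   .setAppName("WordCount")
        //   .setJars(Seq("/path/to/target/jar-with-dependencies.jar"))

        val sc = new SparkContext(conf)
        val counts = sc.textFile("hdfs://xxx:9000/wcinput/pg1184.txt")
          .flatMap(_.split(" "))
          .map(word => (word, 1))
          .reduceByKey(_ + _)
        counts.saveAsTextFile("hdfs://xxx:9000/wcoutput")
        sc.stop()
      }
    }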