Re: Issues with AbstractParams

2014-11-04 Thread Sean Owen
I don't think it's anything to do with AbstractParams. The problem is MovieLensALS$Params, which is a case class without default constructor. It is not Serializable. However you can see it gets used in an RDD function: val ratings = sc.textFile(params.input).map { line => val fields = line.spli

Re: Issues with AbstractParams

2014-11-04 Thread Joseph Bradley
Hi Deb, Thanks for pointing it out! I don't know of a JIRA for it now, so it would be great if you could open one. I'm looking into the bug... Joseph On Tue, Nov 4, 2014 at 4:42 PM, Debasish Das wrote: > Hi, > > I build the master today and I was testing IR statistics on movielens > dataset (o

Re: Hadoop configuration for checkpointing

2014-11-04 Thread Cody Koeninger
Opened https://issues.apache.org/jira/browse/SPARK-4229 Sent a PR https://github.com/apache/spark/pull/3102 On Tue, Nov 4, 2014 at 11:48 AM, Marcelo Vanzin wrote: > On Tue, Nov 4, 2014 at 9:34 AM, Cody Koeninger wrote: > > 2. Is there a reason StreamingContext.getOrCreate defaults to a blank

src/main/resources/kv1.txt not found in example of HiveFromSpark

2014-11-04 Thread Qiuzhuang Lian
When running the HiveFromSpark example via the run-example shell script, I got an error: FAILED: SemanticException Line 1:23 Invalid path ''src/main/resources/kv1.txt'': No files matching path file:/home/kand/javaprojects/spark/src/main/resources/kv1.txt == END HIVE FAILURE OUTPUT =

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Ted Yu
On my MacBook with 2.6 GHz Intel i7 CPU, I run zinc. Here is the tail of mvn build output: [INFO] Spark Project External Flume .. SUCCESS [7.368s] [INFO] Spark Project External ZeroMQ . SUCCESS [9.153s] [INFO] Spark Project External MQTT ...

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Nicholas Chammas
Ah, found it: https://github.com/apache/spark/blob/master/docs/building-spark.md#building-with-sbt This version of the docs should be published once 1.2.0 is released. Nick On Tue, Nov 4, 2014 at 8:53 PM, Alessandro Baretta wrote: > Nicholas, > > Indeed, I was trying to use sbt to speed up the

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Alessandro Baretta
Nicholas, Indeed, I was trying to use sbt to speed up the build. My initial experiments with the Maven process took over 50 minutes, which on a 4-core 2014 MacBook Pro seems obscene. Then again, after the failed attempt with sbt, mvn clean package took only 13 minutes, leading me to think that most

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Nicholas Chammas
Zinc, I believe, is something you can install and run to speed up your Maven builds. It's not required. I get a bunch of warnings when compiling with Maven, too. Dunno if they are expected or not, but things work fine from there on. Many people do indeed use sbt. I don't know where we have docume

Re: [MLlib] Contributing Algorithm for Outlier Detection

2014-11-04 Thread slcclimber
Ashutosh, I still see a few issues. 1. On line 112 you are counting using a counter. Since this will happen in an RDD, the counter will cause issues. Also, using a filter function with a side effect is not good functional style. You could use randomSplit instead. This does not the same thing wit

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Alessandro Baretta
Nicholas, Yes, I saw them, but they refer to Maven, and I'm under the impression that sbt is the preferred way of building Spark. Is Maven indeed the "right way"? Anyway, as per your advice I ctrl-d'ed my sbt shell and ran `mvn -DskipTests clean package`, which completed successfully. So, ind

Issues with AbstractParams

2014-11-04 Thread Debasish Das
Hi, I built master today and was testing IR statistics on the MovieLens dataset (will open a PR in a bit)... Right now in master's examples.MovieLensALS, case class Params extends AbstractParams[Params]. On my localhost Spark, if I run as follows it fails: ./bin/spark-submit --master spark:// t

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Hari Shreedharan
I have seen this on sbt sometimes. I usually do an sbt clean and that fixes it. Thanks, Hari On Tue, Nov 4, 2014 at 3:13 PM, Nicholas Chammas wrote: > FWIW, the "official" build instructions are here: > https://github.com/apache/spark#building-spark > On Tue, Nov 4, 2014 at 5:11 PM, Ted Yu wr

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Nicholas Chammas
FWIW, the "official" build instructions are here: https://github.com/apache/spark#building-spark On Tue, Nov 4, 2014 at 5:11 PM, Ted Yu wrote: > I built based on this commit today and the build was successful. > > What command did you use ? > > Cheers > > On Tue, Nov 4, 2014 at 2:08 PM, Alessand

Re: Build fails on master (f90ad5d)

2014-11-04 Thread Ted Yu
I built based on this commit today and the build was successful. What command did you use ? Cheers On Tue, Nov 4, 2014 at 2:08 PM, Alessandro Baretta wrote: > Fellow Sparkers, > > I am new here and still trying to learn to crawl. Please, bear with me. > > I just pulled f90ad5d from https://git

Build fails on master (f90ad5d)

2014-11-04 Thread Alessandro Baretta
Fellow Sparkers, I am new here and still trying to learn to crawl. Please, bear with me. I just pulled f90ad5d from https://github.com/apache/spark.git and am running the compile command in the sbt shell. This is the error I'm seeing: [error] /home/alex/git/spark/mllib/src/main/scala/org/apache/

[ANN] Spark resources searchable

2014-11-04 Thread Otis Gospodnetic
Hi everyone, We've recently added indexing of all Spark resources to http://search-hadoop.com/spark . Everything is nicely searchable: * user & dev mailing lists * JIRA issues * web site * wiki * source code * javadoc. Maybe it's worth adding to http://spark.apache.org/community.html ? Enjoy!

Re: Surprising Spark SQL benchmark

2014-11-04 Thread Michael Armbrust
dev to bcc. Thanks for reaching out, Ozgun. Let's discuss if there were any missing optimizations off list. We'll make sure to report back or add any findings to the tuning guide. On Mon, Nov 3, 2014 at 3:01 PM, ozgun wrote: > Hey Patrick, > > It's Ozgun from Citus Data. We'd like to make the

Re: Hadoop configuration for checkpointing

2014-11-04 Thread Sean Owen
Let me crash this thread to suggest this *might* be related to this problem I'm trying to solve: https://issues.apache.org/jira/browse/SPARK-4196 Basically the question there is: this blank Configuration object gets made on the driver in the saveAsNewAPIHadoopFiles call, and seems to need to be se

Re: Hadoop configuration for checkpointing

2014-11-04 Thread Marcelo Vanzin
On Tue, Nov 4, 2014 at 9:34 AM, Cody Koeninger wrote: > 2. Is there a reason StreamingContext.getOrCreate defaults to a blank > hadoop configuration rather than > org.apache.spark.deploy.SparkHadoopUtil.get.conf, > which would pull values from spark config? This is probably something I overlooke

Hadoop configuration for checkpointing

2014-11-04 Thread Cody Koeninger
3 quick questions, then some background: 1. Is there a reason not to document the fact that spark.hadoop.* is copied from spark config into hadoop config? 2. Is there a reason StreamingContext.getOrCreate defaults to a blank hadoop configuration rather than org.apache.spark.deploy.SparkHadoopUt
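
For readers skimming this thread, the behavior raised in question 1 can be sketched in a few lines of Scala. This is a minimal illustration under stated assumptions, not Spark's actual code: the object `HadoopConfSketch` and the method `extractHadoopProps` are hypothetical names, and plain `Map`s stand in for the Spark and Hadoop configuration objects. It assumes that a Spark property `spark.hadoop.foo` ends up in the Hadoop configuration as `foo`.

```scala
// Illustrative sketch only: HadoopConfSketch and extractHadoopProps are
// hypothetical names, and Map[String, String] stands in for both the Spark
// config and the Hadoop Configuration.
object HadoopConfSketch {
  // Assumed convention: properties carrying this prefix are forwarded to Hadoop.
  private val Prefix = "spark.hadoop."

  /** Keep only the prefixed entries and strip the prefix from their keys. */
  def extractHadoopProps(sparkProps: Map[String, String]): Map[String, String] =
    sparkProps.collect {
      case (key, value) if key.startsWith(Prefix) =>
        key.stripPrefix(Prefix) -> value
    }
}
```

With this sketch, a setting such as `spark.hadoop.fs.defaultFS` would be passed along as `fs.defaultFS`, while unrelated settings such as `spark.app.name` would be left out. A blank Hadoop configuration, as described in question 2, would carry none of these values.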

Re: [MLlib] Contributing Algorithm for Outlier Detection

2014-11-04 Thread Ashutosh
Anant, I got rid of those increment/decrement functions, and now the code is much cleaner. Please check. All your comments have been addressed. https://github.com/codeAshu/Outlier-Detection-with-AVF-Spark/blob/master/OutlierWithAVFModel.scala _Ashu