[MLlib] Extensibility of MLlib classes (Word2VecModel etc.)

2015-09-09 Thread Maandy
Hey, I'm trying to implement doc2vec (http://cs.stanford.edu/~quocle/paragraph_vector.pdf), mainly for sport/research purposes (due to all its limitations I would probably not even try to PR it into MLlib itself), but to do that it would be highly useful to have access to MLlib's Word2VecModel class
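Since Word2VecModel's internals are private, one workaround that stays on the public surface is to pull the learned vectors out via getVectors and build on top of them. A minimal sketch in that direction, assuming Spark 1.5 MLlib; the input path and the crude averaging step (which is not the paper's PV-DM/PV-DBOW training) are illustrative only:

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.feature.Word2Vec
    import org.apache.spark.mllib.linalg.Vectors

    object ParagraphVectorSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local[*]", "doc2vec-sketch")
        // Hypothetical input: one whitespace-tokenized document per line.
        val docs = sc.textFile("docs.txt").map(_.split(" ").toSeq).cache()

        // Train word vectors through the public MLlib API.
        val dim = 100
        val model = new Word2Vec().setVectorSize(dim).fit(docs)

        // getVectors exposes the learned word -> vector map without needing
        // access to Word2VecModel's private members.
        val bcVectors = sc.broadcast(model.getVectors)

        // Crude document embedding: average the word vectors of each document.
        val docVecs = docs.map { words =>
          val known = words.flatMap(w => bcVectors.value.get(w))
          val sum = new Array[Double](dim)
          known.foreach { v =>
            var i = 0
            while (i < dim) { sum(i) += v(i); i += 1 }
          }
          if (known.nonEmpty) Vectors.dense(sum.map(_ / known.size)) else Vectors.zeros(dim)
        }
        docVecs.take(3).foreach(println)
        sc.stop()
      }
    }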

Did the 1.5 release complete?

2015-09-09 Thread Sean Owen
I saw the end of the RC3 vote: https://mail-archives.apache.org/mod_mbox/spark-dev/201509.mbox/%3CCAPh_B%3DbQWf_vVuPs_eRpvnNSj8fbULX4kULnbs6MCAA10ZQ9eQ%40mail.gmail.com%3E but there are no artifacts for it in Maven? http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.apache.spark%22%20AND%20a%3

[ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread Reynold Xin
Hi All, Spark 1.5.0 is the sixth release on the 1.x line. This release represents 1400+ patches from 230+ contributors and 80+ institutions. To download Spark 1.5.0, visit the downloads page. A huge thanks goes to all of the individuals and organizations involved in the development and testing of this release

Re: Did the 1.5 release complete?

2015-09-09 Thread Reynold Xin
Dev/user announcement was made just now. For Maven, I did publish it this afternoon (so it's been a few hours). If it is still not there tomorrow morning, I will look into it. On Wed, Sep 9, 2015 at 2:42 AM, Sean Owen wrote: > I saw the end of the RC3 vote: > > https://mail-archives.apache.or

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread Yu Ishikawa
Great work, everyone! -- Yu Ishikawa

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread Dimitris Kouzis - Loukas
Yeii! On Wed, Sep 9, 2015 at 2:25 PM, Yu Ishikawa wrote: > Great work, everyone!

looking for a technical reviewer to review a book on Spark

2015-09-09 Thread Mohammed Guller
Hi Spark developers, I am writing a book on Spark. The publisher of the book is looking for a technical reviewer. You will be compensated for your time. The publisher will pay a flat rate per page for the review. I spoke with Matei Zaharia about this and he suggested that I send an email to th

Re: looking for a technical reviewer to review a book on Spark

2015-09-09 Thread Gurumurthy Yeleswarapu
Hi Mohammed: I'm interested. Thanks, Guru Yeleswarapu

Re: looking for a technical reviewer to review a book on Spark

2015-09-09 Thread Gurumurthy Yeleswarapu
My apologies for the broadcast! That email was meant for Mohammed.

RE: (Spark SQL) partition-scoped UDF

2015-09-09 Thread Eron Wright
Follow-up: I solved this problem by overriding the model's `transform` method and using `mapPartitions` to produce a new DataFrame rather than using `udf`. Source code: https://github.com/deeplearning4j/deeplearning4j/blob/135d3b25b96c21349abf488a44f59bb37a2a5930/deeplearning4j-scaleout/spark/d
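For anyone hitting the same per-row setup cost, a rough sketch of the pattern (not the deeplearning4j code itself): do the expensive initialization once per partition inside mapPartitions, then rebuild a DataFrame from the resulting rows. HeavyModel and the "score" column are hypothetical stand-ins.

    import org.apache.spark.sql.{DataFrame, Row, SQLContext}
    import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

    // Hypothetical stand-in for something expensive to construct per partition
    // (e.g. a network loaded into memory).
    class HeavyModel extends Serializable {
      def score(row: Row): Double = row.length.toDouble   // placeholder scoring
    }

    // Partition-scoped transform: the model is built once per partition and
    // reused for every row, which a plain per-row UDF would not allow.
    def transformByPartition(sqlContext: SQLContext, df: DataFrame): DataFrame = {
      val outSchema = StructType(df.schema.fields :+
        StructField("score", DoubleType, nullable = false))
      val scored = df.rdd.mapPartitions { iter =>
        val model = new HeavyModel()                      // amortized per partition
        iter.map(row => Row.fromSeq(row.toSeq :+ model.score(row)))
      }
      sqlContext.createDataFrame(scored, outSchema)
    }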

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread Jerry Lam
Hi Spark Developers, I'm eager to try it out! However, I ran into problems resolving dependencies: [warn] [NOT FOUND ] org.apache.spark#spark-core_2.10;1.5.0!spark-core_2.10.jar (0ms) [warn] jcenter: tried When will the package be available? Best Regards, Jerry On Wed, Sep 9, 2015 at 9:30
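The artifacts land on Maven Central first, so a jcenter-only resolver can lag behind the announcement by a few hours. A build.sbt sketch (Scala version assumed) that points sbt at Central explicitly:

    // build.sbt sketch: resolve spark-core 1.5.0 from Maven Central rather
    // than relying on jcenter, which may not have synced yet.
    scalaVersion := "2.10.4"

    resolvers += "central" at "https://repo1.maven.org/maven2/"

    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0" % "provided"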

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread Ted Yu
Jerry: I just tried building the hbase-spark module with 1.5.0 and I see: ls -l ~/.m2/repository/org/apache/spark/spark-core_2.10/1.5.0 total 21712 -rw-r--r-- 1 tyu staff 196 Sep 9 09:37 _maven.repositories -rw-r--r-- 1 tyu staff 11081542 Sep 9 09:37 spark-core_2.10-1.5.0.jar -rw-r--r--

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread andy petrella
You can try it out really quickly by "building" a Spark Notebook from http://spark-notebook.io/. Just choose the master branch, 1.5.0, and a correct Hadoop version (it defaults to 2.2.0, though), and there you go :-) On Wed, Sep 9, 2015 at 6:39 PM Ted Yu wrote: > Jerry: > I just tried building hbase-

Re: Code generation for GPU

2015-09-09 Thread lonikar
I am already looking at the DataFrame APIs and the implementation. In fact, the columnar representation https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala is what gave me the idea for my talk proposal. It is ideally suited for computation
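For anyone who wants to poke at that columnar path from the API side, a small spark-shell sketch (assuming an existing SparkContext sc): caching a DataFrame stores it through the ColumnType-based in-memory format, and the plan for reads from the cache shows InMemoryColumnarTableScan.

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)      // assumes an existing SparkContext `sc`
    import sqlContext.implicits._

    val df = sc.parallelize(1 to 1000000).map(i => (i, i * 2.0)).toDF("id", "value")
    df.cache()
    df.count()        // materializes the columnar in-memory cache
    df.explain()      // subsequent scans should show InMemoryColumnarTableScan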

Spark 1.5: How to trigger expression execution through UnsafeRow/TungstenProject

2015-09-09 Thread lonikar
The tungsten, codegen, etc. options are enabled by default, but I am not able to get execution through UnsafeRow/TungstenProject. It still executes using InternalRow/Project. I see this in SparkStrategies.scala: If unsafe mode is enabled and we support these data types in Unsafe, use the
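One way to check which path you are on (a sketch, assuming Spark 1.5 defaults): set the Tungsten flag explicitly and look at the physical plan with explain(). If a column type in the projection is not supported by UnsafeRow, the planner falls back to the ordinary Project, which may explain what you are seeing.

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)      // assumes an existing SparkContext `sc`
    import sqlContext.implicits._

    // Explicit, although it defaults to true on 1.5.0.
    sqlContext.setConf("spark.sql.tungsten.enabled", "true")

    val df = sc.parallelize(1 to 1000).map(i => (i, i.toDouble)).toDF("key", "value")
    df.select($"key" + 1, $"value" * 2).explain()
    // Look for TungstenProject (unsafe path) vs. plain Project in the output.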

Re: Spark 1.5: How to trigger expression execution through UnsafeRow/TungstenProject

2015-09-09 Thread Ted Yu
Here is the example from Reynold (http://search-hadoop.com/m/q3RTtfvs1P1YDK8d): scala> val data = sc.parallelize(1 to size, 5).map(x => (util.Random.nextInt(size / repetitions), util.Random.nextDouble)).toDF("key", "value") data: org.apache.spark.sql.DataFrame = [key: int, value: double] scala>
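A self-contained variant of that snippet under assumed values for size and repetitions; the aggregation and explain() at the end are additions here (not part of the linked mail), just to surface the Tungsten operators in the plan:

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.sum

    val sqlContext = new SQLContext(sc)      // assumes an existing SparkContext `sc`
    import sqlContext.implicits._

    val size = 10000000                      // assumed values
    val repetitions = 100
    val data = sc.parallelize(1 to size, 5)
      .map(x => (util.Random.nextInt(size / repetitions), util.Random.nextDouble))
      .toDF("key", "value")

    data.groupBy("key").agg(sum("value")).explain()
    // A Tungsten-enabled plan shows TungstenAggregate operators here.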

Re: Deserializing JSON into Scala objects in Java code

2015-09-09 Thread Kevin Chen
Marcelo and Christopher, thanks for your help! The problem turned out to arise from a different part of the code (we have multiple ObjectMappers), but because I am not very familiar with Jackson, I had thought there was a problem with the Scala module. Thank you again, Kevin
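For the archives, the usual gotcha in this area: every ObjectMapper that touches Scala types needs the Scala module registered, not just one of them. A minimal sketch (shown in Scala; the Event case class and JSON are made up) using jackson-module-scala:

    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.module.scala.DefaultScalaModule

    case class Event(name: String, count: Int)

    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)    // required for Scala case classes

    val event = mapper.readValue("""{"name":"spark","count":3}""", classOf[Event])
    println(event)                               // Event(spark,3)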

[SparkSQL]Could not alter table in Spark 1.5 use HiveContext

2015-09-09 Thread StanZhai
After upgrading Spark from 1.4.1 to 1.5.0, I encountered the following exception when using an ALTER TABLE statement in HiveContext. The SQL is: ALTER TABLE a RENAME TO b. The exception is: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. Invalid
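A minimal reproduction sketch (the table names come from the mail; the CREATE TABLE is an assumption added so the snippet stands alone):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)        // assumes an existing SparkContext `sc`
    hiveContext.sql("CREATE TABLE IF NOT EXISTS a (id INT)")
    hiveContext.sql("ALTER TABLE a RENAME TO b") // fails with the DDLTask error on 1.5.0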