Take a look at the implementation of typed sum/avg:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/expressions/scalalang/typed.scala
You can implement a typed max/min.
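Roughly along these lines (an untested sketch; TypedMax is just an illustrative name, not something that exists in that file):

  import org.apache.spark.sql.{Encoder, Encoders}
  import org.apache.spark.sql.expressions.Aggregator

  // Typed max over Long, modeled on the sum/avg aggregators in typed.scala.
  class TypedMax[IN](val f: IN => Long) extends Aggregator[IN, Long, Long] {
    override def zero: Long = Long.MinValue
    override def reduce(b: Long, a: IN): Long = math.max(b, f(a))
    override def merge(b1: Long, b2: Long): Long = math.max(b1, b2)
    override def finish(reduction: Long): Long = reduction
    override def bufferEncoder: Encoder[Long] = Encoders.scalaLong
    override def outputEncoder: Encoder[Long] = Encoders.scalaLong
  }

  // usage: ds.groupByKey(_._1).agg(new TypedMax[(Int, Int)](_._2.toLong).toColumn)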
On Tue, Jun 7, 2016 at 4:31 PM, Alexander Pivovarov wrote:
> Ted, it does not work like that
Please go ahead.
On Tue, Jun 7, 2016 at 4:45 PM, franklyn wrote:
> Thanks for reproducing it, Ted. Should I make a JIRA issue?
Thanks for reproducing it, Ted. Should I make a JIRA issue?
Ted, it does not work like that;
you have to .map(toAB).toDS
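i.e. something like this (AB and toAB are just names I made up for the example; assumes import spark.implicits._ and org.apache.spark.sql.functions.max):

  case class AB(a: Int, b: Int)
  val toAB = (t: (Int, Int)) => AB(t._1, t._2)

  // map to a case class first, then the columns have names
  val ds = Seq(1->2, 1->5, 3->6).map(toAB).toDS
  ds.groupBy("a").agg(max($"b")).show()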
On Tue, Jun 7, 2016 at 4:07 PM, Ted Yu wrote:
> Have you tried the following ?
>
> Seq(1->2, 1->5, 3->6).toDS("a", "b")
>
> then you can refer to columns by name.
>
> FYI
>
>
> On Tue, Jun 7, 2016 at 3:58 PM, Alexander Pivovarov wrote:
I built with Scala 2.10
>>> df.select(add_one(df.a).alias('incremented')).collect()
The above just hung.
On Tue, Jun 7, 2016 at 3:31 PM, franklyn wrote:
> Thanks, Ted!
>
> I'm using
>
> https://github.com/apache/spark/commit/8f5a04b6299e3a47aca13cbb40e72344c0114860
> and building with scala-2.10
Have you tried the following ?
Seq(1->2, 1->5, 3->6).toDS("a", "b")
then you can refer to columns by name.
FYI
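(Or, if toDS doesn't accept column names, toDF should; untested:)

  import spark.implicits._
  import org.apache.spark.sql.functions.max

  // name the columns via toDF, then group and aggregate by name
  Seq(1->2, 1->5, 3->6).toDF("a", "b")
    .groupBy("a")
    .agg(max("b"))
    .show()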
On Tue, Jun 7, 2016 at 3:58 PM, Alexander Pivovarov wrote:
> I'm trying to switch from RDD API to Dataset API
> My question is about reduceByKey method
>
> e.g. in the following example
I'm trying to switch from RDD API to Dataset API
My question is about reduceByKey method
e.g. in the following example I'm trying to rewrite
sc.parallelize(Seq(1->2, 1->5, 3->6)).reduceByKey(math.max).take(10)
using DS API. That is what I have so far:
Seq(1->2, 1->5, 3->6).toDS.groupBy(_._1).ag
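(Spelled out in full, the closest working version I have is below; I think the typed grouping is called groupByKey in the 2.0 preview. It works, but it is noticeably more verbose than reduceByKey:)

  Seq(1->2, 1->5, 3->6).toDS
    .groupByKey(_._1)
    .reduceGroups((a, b) => if (a._2 >= b._2) a else b)
    .map(_._2)    // reduceGroups returns (key, reducedValue) pairs; keep just the value
    .take(10)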
Thanks, Ted!
I'm using
https://github.com/apache/spark/commit/8f5a04b6299e3a47aca13cbb40e72344c0114860
and building with scala-2.10
I can confirm that it works with scala-2.11
With commit 200f01c8fb15680b5630fbd122d44f9b1d096e02 using Scala 2.11:
Using Python version 2.7.9 (default, Apr 29 2016 10:48:06)
SparkSession available as 'spark'.
>>> from pyspark.sql import SparkSession
>>> from pyspark.sql.types import IntegerType, StructField, StructType
>>> from pyspark.sql.
I've built spark-2.0-preview (8f5a04b) with scala-2.10 using the following:

  ./dev/change-version-to-2.10.sh
  ./dev/make-distribution.sh -DskipTests -Dzookeeper.version=3.4.5 -Dcurator.version=2.4.0 -Dscala-2.10 -Phadoop-2.6 -Pyarn -Phive

and then ran the following code in a pyspark shell.
As far as I know the process is just to copy docs/_site from the build
to the appropriate location in the SVN repo (i.e.
site/docs/2.0.0-preview).
Thanks
Shivaram
On Tue, Jun 7, 2016 at 8:14 AM, Sean Owen wrote:
> As a stop-gap, I can edit that page to have a small section about
> preview releases and point to the nightly docs.
Hi,
I'm searching for how and where Spark allocates cores per executor in the
source code.
Is it possible to programmatically control the number of allocated cores in
standalone cluster mode?
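Something along these lines is what I have in mind (property names taken from the standalone configuration docs; the master URL and values are just illustrative, untested):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setMaster("spark://master:7077")      // illustrative standalone master URL
    .setAppName("core-allocation-test")    // illustrative app name
    .set("spark.executor.cores", "2")      // cores per executor
    .set("spark.cores.max", "8")           // cap on total cores for the application
  val sc = new SparkContext(conf)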
Regards,
Matteo
Thanks Sean, you were right, hard refresh made it show up.
Seems like we should at least link to the preview docs from
http://spark.apache.org/documentation.html.
Tom
On Tuesday, June 7, 2016 10:04 AM, Sean Owen wrote:
It's there (refresh maybe?). See the end of the downloads dropdown.
As a stop-gap, I can edit that page to have a small section about
preview releases and point to the nightly docs.
Not sure who has the power to push 2.0.0-preview to site/docs, but, if
that's done then we can symlink "preview" in that dir to it and be
done, and update this section about preview docs.
It's there (refresh maybe?). See the end of the downloads dropdown.
For the moment you can see the docs in the nightly docs build:
https://home.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/latest/
I don't know what the best way to put this into the main site is.
Under a /preview root?
I just checked and I don't see the 2.0 preview release at all anymore on
http://spark.apache.org/downloads.html. Is it in transition? The only place
I can see it is at http://spark.apache.org/news/spark-2.0.0-preview.html
I would like to see docs there too. My opinion is it should be as eas
Congrats!!
On Mon, Jun 6, 2016, 8:12 AM Gayathri Murali wrote:
> Congratulations Yanbo Liang! Well deserved.
>
>
> On Sun, Jun 5, 2016 at 7:10 PM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:
>
>> Congrats, Yanbo!
>>
>> On Sun, Jun 5, 2016 at 6:25 PM, Liwei Lin wrote:
>>
>>> Congratul
Hi,
I don't know if it is a bug or a feature, but one thing in streaming error
handling seems confusing to me. I create a streaming context, start it, and
call #awaitTermination like this:

  try {
    ssc.awaitTermination();
  } catch (Exception e) {
    LoggerFactory.getLogger(getClass()).error("Job failed. S