See https://issues.apache.org/jira/browse/SPARK-16390
On Sat, Jul 2, 2016 at 6:35 PM, Reynold Xin wrote:
> Thanks, Koert, for the great email. They are all great points.
>
> We should probably create an umbrella JIRA for easier tracking.
>
>
> On Saturday, July 2, 2016, Koert Kuipers wrote:
>
>
Please vote on releasing the following candidate as Apache Spark version
2.0.0. The vote is open until Friday, July 8, 2016 at 23:00 PDT and passes
if a majority of at least 3 +1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 2.0.0
[ ] -1 Do not release this package because ...
I asked this question in the Scala user group two years ago:
https://groups.google.com/forum/#!topic/scala-user/W4f0d8xK1nk
Take a look if you are interested.
On Tue, Jul 5, 2016 at 1:31 PM, Reynold Xin wrote:
> You can file it here: https://issues.scala-lang.org/secure/Dashboard.jspa
>
> Perhap
You can file it here: https://issues.scala-lang.org/secure/Dashboard.jspa
Perhaps "bug" is not the right word, but "limitation". println accepts a
single argument of type Any and returns Unit, and it appears that Scala
fails to infer the correct overloaded method in this case.
def println() = C
I don't think that's a Scala compiler bug.
println is a valid expression that returns Unit.
Unit is not a single-argument function, and does not match any of the
overloads of foreachPartition.
You may be used to a conversion taking place when println is passed to
a method expecting a function, but
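A rough, non-Spark sketch of the ambiguity, assuming simplified stand-ins for Dataset's Scala and Java overloads (the trait and method stubs below are illustrative, not Spark's actual signatures):

object OverloadRepro {
  // Hypothetical stand-in for the Java-friendly functional interface
  // (Spark's real one is ForeachPartitionFunction[T]).
  trait JavaForeachPartition[T] { def call(it: Iterator[T]): Unit }

  // Two single-argument overloads, one per language-specific API.
  def foreachPartition[T](f: Iterator[T] => Unit): Unit = ()
  def foreachPartition[T](f: JavaForeachPartition[T]): Unit = ()

  // foreachPartition[Long](println)
  // fails to compile: with no single expected function type, the bare name
  // println is taken as the zero-argument call println() of type Unit,
  // which matches neither overload.

  // Compiles: the lambda's shape can only match the Scala overload.
  foreachPartition[Long](_.foreach(println))
}

Annotating the lambda's parameter type achieves the same disambiguation in the Spark case.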
Jacek,
This is definitely not necessary, but I wouldn't waste cycles "fixing"
things like this when they have virtually zero impact. Perhaps next time we
update this code we can "fix" it.
Also can you comment on the pull request directly?
On Tue, Jul 5, 2016 at 1:07 PM, Jacek Laskowski wrote:
Oh, you mean instead of:
assert(ds3.select(NameAgg.toColumn).schema.head.nullable === true)
just do:
assert(ds3.select(NameAgg.toColumn).schema.head.nullable)
I did mostly === true because I also had === false, and I liked the
symmetry, but sure, this can be fixed if it's not the norm.
On Tue, Jul 5,
On Mon, Jul 4, 2016 at 6:14 AM, wrote:
> Repository: spark
> Updated Branches:
> refs/heads/master 88134e736 -> 8cdb81fa8
>
>
> [SPARK-15204][SQL] improve nullability inference for Aggregator
>
> ## What changes were proposed in this pull request?
>
> TypedAggregateExpression sets nullable base
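For context on the assertions quoted earlier in the thread, here is a minimal sketch of an Aggregator whose output-schema nullability could be checked that way; the Person and NameAgg definitions below are illustrative guesses, not the actual test code from the patch:

import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class Person(name: String, age: Long)

// Concatenates names; a String (object type) output should be inferred as nullable.
object NameAgg extends Aggregator[Person, String, String] {
  def zero: String = ""
  def reduce(b: String, p: Person): String = b + p.name
  def merge(b1: String, b2: String): String = b1 + b2
  def finish(b: String): String = b
  def bufferEncoder: Encoder[String] = Encoders.STRING
  def outputEncoder: Encoder[String] = Encoders.STRING
}

// Assuming a spark-shell style session with spark.implicits._ in scope:
import spark.implicits._
val ds3 = Seq(Person("a", 1L), Person("b", 2L)).toDS()
assert(ds3.select(NameAgg.toColumn).schema.head.nullable)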
Hi Reynold,
Is this already reported and tracked somewhere? I'm quite sure that
people will be asking about the reasons Spark does this. Where are
such issues usually reported?
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apac
Please consider this vote canceled and I will work on another RC soon.
On Tue, Jun 21, 2016 at 6:26 PM, Reynold Xin wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and passes
> if a majority of a
-sparkr-dev@googlegroups +dev@spark.apache.org
[Please send SparkR development questions to the Spark user / dev
mailing lists. Replies inline]
> From:
> Date: Tue, Jul 5, 2016 at 3:30 AM
> Subject: Call to new JObject sometimes returns an empty R environment
> To: SparkR Developers
>
>
>
> H
This seems like a Scala compiler bug.
On Tuesday, July 5, 2016, Jacek Laskowski wrote:
> Well, there is foreach for Java and another foreach for Scala. That's
> what I can understand. But while supporting two language-specific APIs
> -- Scala and Java -- Dataset API lost support for such simple
These topics have been included in the documentation for recent builds of Spark
2.0.
Michael
> On Jul 5, 2016, at 3:49 AM, Romi Kuntsman wrote:
>
> You can also claim that there's a whole section of "Migrating from 1.6 to
> 2.0" missing there:
> https://spark.apache.org/docs/2.0.0-preview/sql
Well, there is foreach for Java and another foreach for Scala. That's
what I can understand. But while supporting two language-specific APIs
-- Scala and Java -- the Dataset API lost support for such simple calls
without type annotations, so you have to be explicit about the variant
(since I'm using Sca
Right, I should have noticed that in your second mail. But foreach
already does what you want, right? It would be identical here.
These two methods do conceptually different things on different
arguments. I don't think I'd expect them to accept the same functions.
On Tue, Jul 5, 2016 at 3:18 PM
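To spell out that distinction with a small sketch (the Dataset is constructed locally and the parameter types are annotated so the Scala overloads are picked unambiguously under Scala 2.11; assumes a spark-shell style session with spark in scope):

import spark.implicits._
val nums = Seq(1L, 2L, 3L).toDS()   // Dataset[Long]

// foreach: the function is applied once per element.
nums.foreach { n: Long => println(n) }

// foreachPartition: the function is applied once per partition and
// receives that partition's whole iterator.
nums.foreachPartition { it: Iterator[Long] => it.foreach(println) }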
ds is a Dataset, and the problem is that println (or any other
one-argument function) would not work here (and perhaps other methods
with two variants -- Java's and Scala's).
Pozdrawiam,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
A DStream is a sequence of RDDs, not of elements. I don't think I'd
expect to express an operation on a DStream as if it were elements.
On Tue, Jul 5, 2016 at 2:47 PM, Jacek Laskowski wrote:
> Sort of. Your example works, but could you do a mere
> ds.foreachPartition(println)? Why not? What shoul
Sort of. Your example works, but could you do a mere
ds.foreachPartition(println)? Why not? Why should I even see the Java
version?
scala> val ds = spark.range(10)
ds: org.apache.spark.sql.Dataset[Long] = [id: bigint]
scala> ds.foreachPartition(println)
:26: error: overloaded method value foreac
Do you not mean ds.foreachPartition(_.foreach(println)) or similar?
On Tue, Jul 5, 2016 at 2:22 PM, Jacek Laskowski wrote:
> Hi,
>
> It's with the master built today. Why can't I call
> ds.foreachPartition(println)? Is using type annotation the only way to
> go forward? I'd be so sad if that's th
Hi,
It's with the master built today. Why can't I call
ds.foreachPartition(println)? Is using type annotation the only way to
go forward? I'd be so sad if that's the case.
scala> ds.foreachPartition(println)
:28: error: overloaded method value foreachPartition with alternatives:
(func:
org.apa
You can also claim that there's a whole section of "Migrating from 1.6 to
2.0" missing there:
https://spark.apache.org/docs/2.0.0-preview/sql-programming-guide.html#migration-guide
Romi Kuntsman, Big Data Engineer
http://www.totango.com
On Tue, Jul 5, 2016 at 12:24 PM, nihed mbarek wrote:
>
Hi,
I just discovered that SparkSession will replace SQLContext in Spark
2.0.
The JavaDoc is clear:
https://spark.apache.org/docs/2.0.0-preview/api/java/org/apache/spark/sql/SparkSession.html
but there is no mention in the SQL programming guide:
https://spark.apache.org/docs/2.0.0-preview/sql-programming
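For anyone landing here from the docs, a minimal sketch of the new entry point (the app name, master, and file path below are placeholders, not anything prescribed by the guide):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("sparksession-example")
  .master("local[*]")
  .getOrCreate()

val people = spark.read.json("examples/src/main/resources/people.json")
people.createOrReplaceTempView("people")
spark.sql("SELECT name FROM people").show()

// The 1.x entry point remains reachable from the session for existing code:
val sqlContext = spark.sqlContext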