Re: Why is RDD to PairRDDFunctions only via implicits?

2015-05-31 Thread Reynold Xin
Dropping user list, adding dev. Thanks, Justin, for the PoC. This is a good idea to explore, especially for Spark 2.0.

On Fri, May 22, 2015 at 12:08 PM, Justin Pihony wrote:
> The (crude) proof of concept seems to work:
>
> class RDD[V](value: List[V]){
> def doStuff = println("I'm doing st
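For context, here is a minimal, self-contained sketch contrasting the two approaches under discussion: the status quo, where pair-only methods live in a separate class reached via an implicit conversion (the pattern behind Spark's rddToPairRDDFunctions), and the proof-of-concept alternative of defining those methods directly on the class, gated by implicit evidence. MyRDD, PairFunctions, and keysDirect are illustrative names, not Spark's actual classes.

    import scala.language.implicitConversions

    // Toy stand-in for RDD, just enough to contrast the two approaches.
    class MyRDD[V](val values: List[V]) {
      // PoC-style alternative: a pair-only method sits directly on the class,
      // usable only when the compiler can prove V is a (K, U) tuple.
      def keysDirect[K, U](implicit ev: V <:< (K, U)): List[K] =
        values.map(v => ev(v)._1)
    }

    // Status quo: pair-only methods live in a separate class that is reached
    // through an implicit conversion, as with Spark's PairRDDFunctions.
    class PairFunctions[K, U](rdd: MyRDD[(K, U)]) {
      def keys: List[K] = rdd.values.map(_._1)
    }

    object MyRDD {
      implicit def toPairFunctions[K, U](rdd: MyRDD[(K, U)]): PairFunctions[K, U] =
        new PairFunctions(rdd)
    }

    object Demo extends App {
      val pairs = new MyRDD(List("a" -> 1, "b" -> 2))
      println(pairs.keys)        // List(a, b), via the implicit conversion
      println(pairs.keysDirect)  // List(a, b), via <:< evidence, no conversion
    }

The trade-off the thread is weighing: the evidence-based form keeps everything discoverable on one class, while the conversion keeps the core RDD surface small.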

Re: [VOTE] Release Apache Spark 1.4.0 (RC3)

2015-05-31 Thread Guoqiang Li
+1 (non-binding)

-- Original --
From: "Sandy Ryza"
Date: Mon, Jun 1, 2015 07:34 AM
To: "Krishna Sankar"
Cc: "Patrick Wendell"; "dev@spark.apache.org"
Subject: Re: [VOTE] Release Apache Spark 1.4.0 (RC3)

+1 (non-binding) Launched against a pseudo-d

Re: [VOTE] Release Apache Spark 1.4.0 (RC3)

2015-05-31 Thread Sandy Ryza
+1 (non-binding) Launched against a pseudo-distributed YARN cluster running Hadoop 2.6.0 and ran some jobs.

-Sandy

On Sat, May 30, 2015 at 3:44 PM, Krishna Sankar wrote:
> +1 (non-binding, of course)
>
> 1. Compiled OSX 10.10 (Yosemite) OK, Total time: 17:07 min
>    mvn clean package -Pyarn

Re: Catalyst: Reusing already computed expressions within a projection

2015-05-31 Thread Reynold Xin
I think Michael's bringing up code gen because the compiler (not Spark, but javac and the JVM JIT) already does common subexpression elimination, so we might get it for free during code gen.

On Sun, May 31, 2015 at 11:48 AM, Justin Uang wrote:
> Thanks for pointing to that link! It looks like it’s
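To make the "free CSE" point concrete, here is a toy sketch of the shape of generated projection code (made-up names, not Spark's actual generated classes): the naive form re-derives f(x) for each output column, and common subexpression elimination, whether performed by Catalyst or by javac/the JIT on pure inlined code, collapses the two calls into one.

    object CsePlayground {
      // Stand-ins for compiled expression trees; names are made up.
      @inline def f(x: Long): Long = x * x + 1
      @inline def g(y: Long): Long = y - 3

      // Naive generated projection: each output column re-derives f(x).
      def projectNaive(x: Long): Array[Long] = Array(f(x), g(f(x)))

      // What CSE amounts to, wherever it happens: compute f(x) once.
      def projectCse(x: Long): Array[Long] = {
        val y = f(x)
        Array(y, g(y))
      }

      def main(args: Array[String]): Unit = {
        assert(projectNaive(5).sameElements(projectCse(5)))
        println(projectCse(5).mkString(", "))   // 26, 23
      }
    }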

Re: Catalyst: Reusing already computed expressions within a projection

2015-05-31 Thread Justin Uang
Thanks for pointing to that link! It looks useful, but more complicated than the case I’m trying to address. In my case, we set y = f(x) and then use y later in future projections (z = g(y)). In that case, the analysis is trivial, in that we aren’t trying to find equivalen
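A hedged illustration of that case using the public DataFrame API (the column names x, y, and z are made up): the second projection references the already-computed column y rather than re-deriving f(x), which is exactly the reuse being asked about.

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    def addDerivedColumns(df: DataFrame): DataFrame = {
      // y = f(x): pretend this expression is expensive.
      df.withColumn("y", col("x") * 2 + 1)
        // z = g(y): built from column y, not from f(x) again.
        .withColumn("z", col("y") * 10)
    }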

Re: Dataframe's .drop in PySpark doesn't accept Column

2015-05-31 Thread Olivier Girardot
I understand the rationale, but when you need to reference a column whose name is not unique, for example when using a join, it can be confusing in terms of API. However, I figured out that you can use a "qualified" name for the column using the *other-dataframe.column_name* syntax, maybe we just
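A small sketch of that qualified-reference trick; the example below is Scala, where the qualification is written other_df("column_name") (in PySpark the same thing is the attribute form other_df.column_name), and the frame and column names are invented.

    import org.apache.spark.sql.DataFrame

    def joinAndDisambiguate(left: DataFrame, right: DataFrame): DataFrame = {
      // Both frames carry an "id" column; qualify each reference through the
      // owning DataFrame instead of using an ambiguous bare column name.
      left.join(right, left("id") === right("id"))
          .select(left("id"), right("value"))
    }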