Re: combining operations elegantly

2014-03-24 Thread Richard Siebeling
Hi guys, thanks for the information. I'll give it a try with Algebird. Thanks again, Richard

@Patrick, thanks for the release calendar.

On Mon, Mar 24, 2014 at 12:16 AM, Patrick Wendell wrote:
> Hey All,
>
> I think the old thread is here:
> https://groups.google.com/forum/#!msg/spark-users/gVt

Re: combining operations elegantly

2014-03-23 Thread Patrick Wendell
Hey All,

I think the old thread is here: https://groups.google.com/forum/#!msg/spark-users/gVtOp1xaPdU/Uyy9cQz9H_8J

The method proposed in that thread is to create a utility class for doing single-pass aggregations. Using Algebird is a pretty good way to do this and is a bit more flexible since y
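
For readers who cannot reach the old thread, here is a minimal sketch of what a single-pass aggregation utility class might look like. It is not the code from that thread: the `Stats` class and its `add`, `merge`, and `of` members are hypothetical names chosen for illustration, and a local `Seq` stands in for the RDD so the snippet runs without Spark.

```scala
// Hypothetical helper (not from the thread): running statistics for Int values.
final case class Stats(count: Long, sum: Long, min: Int, max: Int) {
  // Fold one more element into the running statistics.
  def add(x: Int): Stats =
    Stats(count + 1, sum + x, math.min(min, x), math.max(max, x))

  // Combine two partial results, for example from two partitions.
  def merge(that: Stats): Stats =
    Stats(count + that.count, sum + that.sum,
          math.min(min, that.min), math.max(max, that.max))

  // The average is derived at the end from the accumulated sum and count.
  def mean: Double = if (count == 0) Double.NaN else sum.toDouble / count
}

object Stats {
  // Identity element: merging it with any other Stats changes nothing.
  val empty: Stats = Stats(0L, 0L, Int.MaxValue, Int.MinValue)

  // Single pass over a local collection standing in for an RDD.
  def of(xs: Seq[Int]): Stats = xs.foldLeft(empty)(_.add(_))
}
```

On an actual `RDD[Int]`, the same two functions should fit `rdd.aggregate(Stats.empty)(_.add(_), _.merge(_))`, where `add` plays the role of the per-partition operation and `merge` combines the partition results.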

Re: combining operations elegantly

2014-03-23 Thread Koert Kuipers
i currently typically do something like this:

scala> val rdd = sc.parallelize(1 to 10)
scala> import com.twitter.algebird.Operators._
scala> import com.twitter.algebird.{Max, Min}
scala> rdd.map{ x => (
     |   1L,
     |   Min(x),
     |   Max(x),
     |   x
     | )}.reduce(_ + _)
res0: (Long,

Re: combining operations elegantly

2014-03-23 Thread Richard Siebeling
Hi Koert, Patrick,

do you already have an elegant solution to combine multiple operations on a single RDD? Say for example that I want to do a sum over one column, a count and an average over another column.

Thanks in advance,
Richard

On Mon, Mar 17, 2014 at 8:20 AM, Richard Siebeling wrote:
>
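
One way to read this question, as a rough sketch without any library: map every row to a tuple of partial results and add the tuples element-wise in a single `reduce`. The `Row` and `Result` types, their field names, and the `summarize` function are assumptions made up for this example, and a local `Seq` stands in for the RDD.

```scala
object ColumnAggregates {
  // Hypothetical two-column row; the field names are illustrative only.
  final case class Row(a: Double, b: Double)

  // Sum over column a, count and average over column b.
  final case class Result(sumA: Double, countB: Long, avgB: Double)

  def summarize(rows: Seq[Row]): Result = {
    // Each row becomes (a, 1, b); adding tuples element-wise yields
    // (sum of a, number of rows, sum of b) in one pass.
    val (sumA, countB, sumB) = rows
      .map(r => (r.a, 1L, r.b))
      .reduce((x, y) => (x._1 + y._1, x._2 + y._2, x._3 + y._3))

    // The average is computed once, after the pass, from sum and count.
    Result(sumA, countB, sumB / countB)
  }
}
```

The sketch does not handle empty input: `reduce` throws on an empty collection. Since an RDD offers `map` and `reduce` with the same shape, the body of `summarize` should carry over to an `RDD[Row]` largely unchanged.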

Re: combining operations elegantly

2014-03-17 Thread Richard Siebeling
Patrick, Koert,

I'm also very interested in these examples, could you please post them if you find them?

Thanks in advance,
Richard

On Thu, Mar 13, 2014 at 9:39 PM, Koert Kuipers wrote:
> not that long ago there was a nice example on here about how to combine
> multiple operations on a single