Hi guys,
Thanks for the information. I'll give it a try with Algebird.
Thanks again,
Richard
@Patrick, thanks for the release calendar.
On Mon, Mar 24, 2014 at 12:16 AM, Patrick Wendell wrote:
Hey All,
I think the old thread is here:
https://groups.google.com/forum/#!msg/spark-users/gVtOp1xaPdU/Uyy9cQz9H_8J
The method proposed in that thread is to create a utility class for
doing single-pass aggregations. Using Algebird is a pretty good way to
do this and is a bit more flexible since y
I typically do something like this:
scala> val rdd = sc.parallelize(1 to 10)
scala> import com.twitter.algebird.Operators._
scala> import com.twitter.algebird.{Max, Min}
scala> rdd.map{ x => (
| 1L,
| Min(x),
| Max(x),
| x
| )}.reduce(_ + _)
res0: (Long,
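
For comparison, the same single-pass idea can be sketched without Algebird by writing a small accumulator by hand. This is only an illustration under stated assumptions: the `Stats` class, its `add` and `merge` methods, and `emptyStats` are hypothetical names that do not come from the thread, and plain Scala collections stand in for RDD partitions so the snippet can be pasted into the Scala REPL without Spark.

```scala
// Hypothetical accumulator for count, sum, min and max in one pass.
// None of these names come from the thread or from any library.
final case class Stats(count: Long, sum: Double, min: Double, max: Double) {
  // Fold a single value into the running result.
  def add(x: Double): Stats =
    Stats(count + 1, sum + x, math.min(min, x), math.max(max, x))

  // Combine two partial results; this is associative and commutative,
  // so partial results can be merged in any order.
  def merge(that: Stats): Stats =
    Stats(count + that.count, sum + that.sum,
      math.min(min, that.min), math.max(max, that.max))

  // The average falls out of count and sum without a second pass.
  def mean: Double = if (count == 0) Double.NaN else sum / count
}

// Identity element: merging it with any Stats leaves that Stats unchanged.
val emptyStats = Stats(0L, 0.0, Double.PositiveInfinity, Double.NegativeInfinity)

// Local stand-in for an RDD: the numbers 1 to 10 split into two "partitions".
val partitions = Seq((1 to 5).map(_.toDouble), (6 to 10).map(_.toDouble))

// One pass over each partition, then one merge of the partial results.
val partials = partitions.map(_.foldLeft(emptyStats)((acc, x) => acc.add(x)))
val total = partials.reduce((a, b) => a.merge(b))
```

On an actual RDD[Double], the same two functions should fit Spark's aggregate method, along the lines of rdd.aggregate(emptyStats)((acc, x) => acc.add(x), (a, b) => a.merge(b)). The Algebird version above gets the equivalent combining logic from its tuple semigroup instead of a hand-written merge.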
Hi Koert, Patrick,
Do you already have an elegant solution for combining multiple operations on a
single RDD?
Say, for example, that I want to do a sum over one column, a count and an
average over another column.
Thanks in advance,
Richard
On Mon, Mar 17, 2014 at 8:20 AM, Richard Siebeling wrote:
>
Patrick, Koert,
I'm also very interested in these examples. Could you please post them if
you find them?
Thanks in advance,
Richard
On Thu, Mar 13, 2014 at 9:39 PM, Koert Kuipers wrote:
> not that long ago there was a nice example on here about how to combine
> multiple operations on a single