Re: Comparative study

Daniel Siegmann Tue, 08 Jul 2014 06:59:48 -0700

In addition to Scalding and Scrunch, there is Scoobi. Unlike the other two,
it is pure Scala (it doesn't wrap a Java framework). All three have fairly
similar APIs and aren't too different from Spark. For example, instead of
RDD you have DList (distributed list) in Scoobi or PCollection (parallel
collection) in Scrunch - or in Scalding's case, Pipe, because Cascading had
to get cute with its names.
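
To give a rough idea of the shared shape, here is a minimal word count
sketch written against plain Scala collections only. It does not use any of
the four libraries; the object and method names (WordCountSketch, wordCount)
are made up for illustration. In Spark, Scoobi, Scrunch or Scalding the
input would be an RDD, DList, PCollection or Pipe rather than a Seq, and the
grouping/counting step is spelled differently in each, so treat this as an
analogy rather than any one library's API.

```scala
// Illustration only: standard Scala collections, no Spark, Scoobi,
// Scrunch or Scalding dependency. Names here are hypothetical.
object WordCountSketch {

  // Split lines into words, group identical words, count each group.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))  // one element per word
      .filter(_.nonEmpty)        // drop empty tokens from blank lines
      .groupBy(identity)         // word -> all occurrences of that word
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    val lines = Seq("spark scalding scoobi", "scrunch spark")
    // Sort by word so the printed order is stable.
    wordCount(lines).toSeq.sortBy(_._1).foreach {
      case (word, count) => println(s"$word\t$count")
    }
  }
}
```

The distributed versions follow the same flatMap-then-aggregate pattern, but
they build a job that runs on the cluster instead of evaluating eagerly in
memory.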



On Mon, Jul 7, 2014 at 8:12 PM, Sean Owen <so...@cloudera.com> wrote:

> On Tue, Jul 8, 2014 at 1:05 AM, Nabeel Memon <nm3...@gmail.com> wrote:
>
>> For Scala API on map/reduce (hadoop engine) there's a library called
>> "Scalding". It's built on top of Cascading. If you have a huge dataset or
>> if you consider using map/reduce engine for your job, for any reason, you
>> can try Scalding.
>>
>
> PS Crunch also has a Scala API called Scrunch. And Crunch can run its jobs
> on Spark too, not just M/R.
>
>
>


-- 
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: daniel.siegm...@velos.io W: www.velos.io
