Re: flink terasort

Fabian Hueske Wed, 03 Jun 2015 14:41:43 -0700

A TeraSort implementation for the current DataSet API would look a bit
different from the one written against the deprecated Record API.
Flink doesn't support automatic range partitioning. However, if the
distribution of the values is known, you can range-partition the data
yourself with a custom partitioner (DataSet.partitionCustom()) and then
call DataSet.sortPartition() on the result. Together these two steps give
you a global sort, so you can implement a TeraSort program.
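
To make the idea concrete, here is a rough sketch in plain Java (no Flink
dependency) of the logic such a partitioner could use. It assumes the keys
are longs spread roughly uniformly over a known [minKey, maxKey] range. The
class and method names (RangePartitionSketch, partition, globalSort) are made
up for illustration and are not part of Flink. In a real job, the body of
partition() would go into the Partitioner you pass to
DataSet.partitionCustom(), and the per-partition sort would be done by
DataSet.sortPartition().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: simulates range partitioning followed by
// a local sort of each partition, outside of Flink.
class RangePartitionSketch {

    // Maps a key to a partition index. Assumes keys are roughly uniformly
    // distributed over [minKey, maxKey]; a skewed distribution would need
    // explicit split points instead. The mapping is monotonic in the key,
    // which is what makes the concatenated partitions globally sorted.
    static int partition(long key, long minKey, long maxKey, int numPartitions) {
        double fraction = (double) (key - minKey) / ((double) (maxKey - minKey) + 1.0);
        int p = (int) (fraction * numPartitions);
        // Clamp in case a key falls outside the assumed range.
        return Math.min(Math.max(p, 0), numPartitions - 1);
    }

    // Range-partitions the input, sorts each partition locally, and
    // concatenates the partitions in index order.
    static List<Long> globalSort(List<Long> input, long minKey, long maxKey, int numPartitions) {
        List<List<Long>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (long key : input) {
            partitions.get(partition(key, minKey, maxKey, numPartitions)).add(key);
        }
        List<Long> result = new ArrayList<>();
        for (List<Long> part : partitions) {
            Collections.sort(part); // stands in for sortPartition()
            result.addAll(part);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> input = Arrays.asList(42L, 7L, 99L, 0L, 63L, 15L, 88L, 31L);
        System.out.println(globalSort(input, 0L, 99L, 4));
    }
}
```

The important property is that lower partition indexes only ever receive
smaller keys, so reading the sorted partitions in index order yields a
globally sorted result.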


Just drop a mail if you have further questions.

Cheers, Fabian

2015-06-03 17:34 GMT+02:00 Bill Sparks <jspa...@cray.com>:

>  Will take a look, thanks.
>  --
>  Jonathan (Bill) Sparks
> Software Architecture
> Cray Inc.
>
>   From: Chiwan Park <chiwanp...@icloud.com>
> Reply-To: "user@flink.apache.org" <user@flink.apache.org>
> Date: Wednesday, June 3, 2015 10:24 AM
> To: "user@flink.apache.org" <user@flink.apache.org>
> Subject: Re: flink terasort
>
>   There is a TeraSort implementation that uses the deprecated API.
>
> https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/recordJobs/sort/TeraSort.java
>
> AFAIK, there is no implementation that uses the current API.
>
> Regards,
> Chiwan Park
>
>
>
>  On Jun 4, 2015, at 12:17 AM, Bill Sparks <jspa...@cray.com> wrote:
>
>   Just asking, is there an implementation of terasort for flink?
>
>  Regards,
>    Bill.
>  --
>  Jonathan (Bill) Sparks
> Software Architecture
> Cray Inc.
>
>
>
