A TeraSort implementation for the current DataSet API would look a bit different from the one written against the deprecated Record API. Flink doesn't support automatic range partitioning, but you can still do a global sort by hand: use a custom partitioner (DataSet.partitionCustom()) that range-partitions the data, which works here because the distribution of values is known, and follow it with DataSet.sortPartition(). Each partition then holds one contiguous, sorted key range, which gives you a global sort and with it a TeraSort program.
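To make the idea concrete, here is a minimal sketch in plain Java. It deliberately does not use Flink; it only illustrates why range partitioning followed by a per-partition sort yields a global order. The class name RangeSortSketch, the helper names, and the split points are all made up for illustration. In a real job the role of partitionFor() would be played by the custom partitioner passed to DataSet.partitionCustom(), and the per-partition sort by DataSet.sortPartition().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RangeSortSketch {

    // Maps a key to a partition index using sorted, distinct split points.
    // Assumption: the split points are known up front (the value distribution is known).
    // Keys below boundaries[0] go to partition 0; keys >= the last boundary go to the last partition.
    static int partitionFor(int key, int[] boundaries) {
        int idx = Arrays.binarySearch(boundaries, key);
        // binarySearch returns (-(insertionPoint) - 1) when the key is not a boundary itself.
        return idx >= 0 ? idx + 1 : -(idx + 1);
    }

    // Range-partitions the input, sorts every partition on its own, and
    // concatenates the partitions in index order.
    static List<Integer> globalSort(List<Integer> input, int[] boundaries) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i <= boundaries.length; i++) {
            partitions.add(new ArrayList<>());
        }
        // Step 1: range partitioning (stand-in for a custom partitioner).
        for (int value : input) {
            partitions.get(partitionFor(value, boundaries)).add(value);
        }
        // Step 2: local sort per partition (stand-in for a per-partition sort).
        List<Integer> result = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            Collections.sort(partition);
            result.addAll(partition);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] boundaries = {25, 50, 75}; // hypothetical split points, 4 partitions
        List<Integer> input = Arrays.asList(42, 7, 99, 15, 63, 0, 88, 31);
        System.out.println(globalSort(input, boundaries));
    }
}
```

Because every key in partition i is smaller than every key in partition i+1, no merge step is needed after the local sorts. How evenly the work is spread depends entirely on how well the split points match the real key distribution.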
Just drop a mail if you have further questions.

Cheers, Fabian

2015-06-03 17:34 GMT+02:00 Bill Sparks <jspa...@cray.com>:

> Will take a look, thanks.
> --
> Jonathan (Bill) Sparks
> Software Architecture
> Cray Inc.
>
> From: Chiwan Park <chiwanp...@icloud.com>
> Reply-To: "user@flink.apache.org" <user@flink.apache.org>
> Date: Wednesday, June 3, 2015 10:24 AM
> To: "user@flink.apache.org" <user@flink.apache.org>
> Subject: Re: flink terasort
>
> There is a terasort implementation with deprecated API.
>
> https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/recordJobs/sort/TeraSort.java
>
> AFAIK, there is no implementation with current API.
>
> Regards,
> Chiwan Park
>
> On Jun 4, 2015, at 12:17 AM, Bill Sparks <jspa...@cray.com> wrote:
>
> Just asking, is there an implementation of terasort for flink?
>
> Regards,
> Bill.
> --
> Jonathan (Bill) Sparks
> Software Architecture
> Cray Inc.