Re: coGroup Iterator NoSuchElement

2015-06-03 Thread Stephan Ewen
Hi! The code snippet is not very revealing. Can you also shot the implementations of the CoGroupFunctions? Thanks! On Wed, Jun 3, 2015 at 3:50 PM, Mustafa Elbehery wrote: > Code Snippet :) > > DataSet updatedPersonOne = inPerson.coGroup(inStudent) >.where("name").eq

Re: flink terasort

2015-06-03 Thread Fabian Hueske
A TeraSort implementation for the current DataSet API would look a bit different from the deprecated Record API. Flink doesn't support automatic range partitioning, but by using a custom partitoner (DataSet.partitionCustom()) which range partitions (distribution of values is known) and a subsequent

Re: bug? open() not getting called with RichWindowMapFunction

2015-06-03 Thread Robert Metzger
Since Flink is a community driven open source project, there is no fixed timeline. But there is a thread on this mailing list discussing the next release. My gut feeling (it depends on the community) tells me that we'll have a 0.9 release in the next one or two weeks. On Tue, May 26, 2015 at 7:34

Re: Problem with Amazon S3

2015-06-03 Thread Robert Metzger
Flink allows to use Hadoop's FileSystem interface as well [1]. Hadoop actually ships a s3 file system implementation by default, and I suspect its in a better shape than Flink's implementation. Maybe it would make sense to use Hadoop's S3 implementation through Flink's Hadoop FS support. Please l

Re: count the k-means iteration

2015-06-03 Thread Robert Metzger
Sorry for the late reply. Operators part of an iteration allow you to get the current iteration id like this: getIterationRuntimeContext().getSuperstepNumber() To get the IterationRuntimeContext, the function needs to implement the Rich* interface. On Tue, May 26, 2015 at 4:01 PM, Pa Rö wrote: >

Re: flink terasort

2015-06-03 Thread Bill Sparks
Will take a look, thanks. -- Jonathan (Bill) Sparks Software Architecture Cray Inc. From: Chiwan Park mailto:chiwanp...@icloud.com>> Reply-To: "user@flink.apache.org" mailto:user@flink.apache.org>> Date: Wednesday, June 3, 2015 10:24 AM To: "user@flink.apache.org

Re: flink terasort

2015-06-03 Thread Chiwan Park
There is a terasort implementation with deprecated API. https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/recordJobs/sort/TeraSort.java AFAIK, there is no implementation with current API. Regards, Chiwan Park > On Jun 4, 2015, at 12:17 AM, Bill Sparks

flink terasort

2015-06-03 Thread Bill Sparks
Just asking, is there an implementation of terasort for flink? Regards, Bill. -- Jonathan (Bill) Sparks Software Architecture Cray Inc.

Re: coGroup Iterator NoSuchElement

2015-06-03 Thread Mustafa Elbehery
Code Snippet :) DataSet updatedPersonOne = inPerson.coGroup(inStudent) .where("name").equalTo("name") .with(new ComputeStudiesProfile()); DataSet updatedPersonTwo = updatedPersonOne.coGroup(inJobs) .where("name").equ

coGroup Iterator NoSuchElement

2015-06-03 Thread Mustafa Elbehery
Hi, I am trying to write two coGrouprs in sequence on the same ETL .. In use common dataset in both of them, in the first coGroup I update the initial dataset and retrieve the result in a new dataset object. Then I use the result in the second coGroup with another new dataset. While debugging, I

Re: repartion locally to task manager

2015-06-03 Thread Stephan Ewen
Hi Ventura! Sorry for the late response. Here are a few ideas or comments that may help you: 1) We want to make it possible for a function (such as MapFunction) to figure out on which TaskManager it is running. The mechanism would be something like "getRuntimeContext().getTaskManagerInformation()