Hi, as far as I can tell there is no direct equivalent, which is probably due to the differences in the underlying execution models.
I think the desired behaviour can be expressed by something along the lines of:

    stream.groupBy(0).window(Count.of(<size>))

where stream is a DataStream<Tuple2<K, V>> and <size> would be the batch size of your Spark Streaming job. The window can also be expressed in terms of time, which would look something like this:

    .window(Time.of(<time>, <time_unit>))

You can find slides on the streaming API at [1], and there are a number of examples at [2].

Best regards,
Martin

[1] http://dataartisans.github.io/flink-training/dataStreamBasics/intro.html
[2] https://github.com/dataArtisans/flink-training-exercises/tree/master/src/main/java/com/dataArtisans/flinkTraining/exercises/dataStreamJava

Liang Chen <chenliang...@huawei.com> wrote on Sat, 12 Sep 2015 at 05:53:

> Hi
>
> Now I am considering migrating a Spark Streaming case to Flink to compare
> performance.
>
> Does Flink support groupByKey([numTasks])? When called on a dataset of (K,
> V) pairs, it returns a dataset of (K, Iterable<V>) pairs.
> If it does not exist, how can I use groupBy() to implement the same function?
>
> --
> View this message in context:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Does-flink-support-groupByKey-numTasks-tp7973.html
> Sent from the Apache Flink Mailing List archive. mailing list archive at
> Nabble.com.
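P.S. For anyone comparing the two models: the semantics of Spark's groupByKey on a bounded batch of (K, V) pairs (which is roughly what one count window of the grouped Flink stream would contain) can be sketched in plain Java. This is only an illustration of the (K, V) -> (K, Iterable<V>) contract, not Flink or Spark code; the sample data and class name are made up for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByKeySketch {
    public static void main(String[] args) {
        // A small batch of (K, V) pairs, standing in for one window of a
        // DataStream<Tuple2<K, V>> (sample data is hypothetical).
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1),
                Map.entry("b", 2),
                Map.entry("a", 3));

        // groupByKey semantics: collect all values that share a key,
        // producing (K, Iterable<V>) pairs.
        Map<String, List<Integer>> grouped = pairs.stream()
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));

        System.out.println(grouped);
    }
}
```

In Flink's streaming model there is no global "end of batch", which is why the grouping has to be scoped by a window (by count or by time) before a full Iterable<V> per key can be materialized.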