Hi Michele,

Flink supports coGroups on sorted inputs. If you have ds1 = DataSet[(Key, Value1)] and ds2 = DataSet[(Key, Value2)], you can obtain a sorted coGroup, for example, by:
ds1.coGroup(ds2).where(0).equalTo(0).sortFirstGroup(1, Order.ASCENDING).sortSecondGroup(1, Order.DESCENDING)

Cheers,
Till

On Tue, Jul 21, 2015 at 7:27 AM, Michele Bertoni <michele1.bert...@mail.polimi.it> wrote:

> Hi everybody, I need to execute a cogroup on sorted groups.
> To explain it better: I have two datasets, i.e. (key, value). I want to
> cogroup on key and then have both iterators sorted by value.
> How can I get that?
> I know the iterators could be collected and then sorted, but I want to avoid that.
> What happens if I partition the datasets separately by key, then sort each partition,
> and finally cogroup by key? Can I assume they keep the order on key?
>
> What is the drawback of doing this?
> I expect to have two data shuffles, one for the partition and one for the cogroup.
>
> Thanks
>
> Best
> michele
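
For reference, the sorted coGroup from the reply could look roughly like this as a complete Scala program. This is only a minimal sketch against the DataSet API: the object name, the sample data, and the choice of String keys with Int values are assumptions made for illustration, not part of the original thread.

```scala
import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

// Hypothetical example object; names and data are illustrative only.
object SortedCoGroupSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Assumed sample inputs: (key, value) pairs.
    val ds1: DataSet[(String, Int)] = env.fromElements(("a", 3), ("a", 1), ("b", 2))
    val ds2: DataSet[(String, Int)] = env.fromElements(("a", 10), ("a", 20), ("b", 5))

    val result: DataSet[(List[Int], List[Int])] = ds1
      .coGroup(ds2)
      .where(0)                              // key field of ds1
      .equalTo(0)                            // key field of ds2
      .sortFirstGroup(1, Order.ASCENDING)    // ds1 values ascending within each key
      .sortSecondGroup(1, Order.DESCENDING) { // ds2 values descending within each key
        (left: Iterator[(String, Int)],
         right: Iterator[(String, Int)],
         out: Collector[(List[Int], List[Int])]) =>
          // Both iterators arrive already sorted by field 1 in the requested order.
          // Materializing them as lists here is only for printing; a real function
          // could stream over them instead.
          out.collect((left.map(_._2).toList, right.map(_._2).toList))
      }

    // In Flink 0.9 and later, print() triggers execution of the program.
    result.print()
  }
}
```

The user function can consume the two iterators directly in sorted order, so it does not have to collect and sort the values itself.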