Hi Michele,

Flink supports coGroup on sorted inputs. Given ds1: DataSet[(Key, Value1)]
and ds2: DataSet[(Key, Value2)], you can obtain a sorted coGroup, for
example, with:

ds1.coGroup(ds2).where(0).equalTo(0)
  .sortFirstGroup(1, Order.ASCENDING)
  .sortSecondGroup(1, Order.DESCENDING)
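
For a fuller picture, here is a minimal sketch of a complete job, assuming
the Scala DataSet API (flink-scala on the classpath). The object name, the
sample data, and the String/Int/Double element types are made up for
illustration. Field 0 is the key, and each group is sorted on field 1 before
the function sees it, so nothing has to be collected and sorted by hand:

```scala
import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._

// Hypothetical example job; names and sample data are assumptions.
object SortedCoGroupSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Two (key, value) datasets with deliberately unsorted values.
    val ds1: DataSet[(String, Int)] =
      env.fromElements(("a", 3), ("a", 1), ("b", 2))
    val ds2: DataSet[(String, Double)] =
      env.fromElements(("a", 0.5), ("a", 1.5), ("c", 9.0))

    val result: DataSet[String] = ds1
      .coGroup(ds2)
      .where(0)                             // key of the first input
      .equalTo(0)                           // key of the second input
      .sortFirstGroup(1, Order.ASCENDING)   // first iterator sorted by value
      .sortSecondGroup(1, Order.DESCENDING) // second iterator sorted by value
      .apply { (left: Iterator[(String, Int)], right: Iterator[(String, Double)]) =>
        // Both iterators arrive already sorted; just render their values.
        val l = left.map(_._2).mkString("[", ", ", "]")
        val r = right.map(_._2).mkString("[", ", ", "]")
        s"first=$l second=$r"
      }

    result.print()
  }
}
```

Keys that appear in only one input still produce a group; the iterator for
the other side is then simply empty.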

Cheers,
Till

On Tue, Jul 21, 2015 at 7:27 AM, Michele Bertoni <
michele1.bert...@mail.polimi.it> wrote:

> Hi everybody, I need to execute a coGroup on sorted groups.
> To explain it better: I have two datasets of (key, value) pairs. I want to
> coGroup on key and then have both iterators sorted by value.
> How can I get that?
> I know the iterators could be collected and sorted, but I want to avoid it.
> What happens if I partition the datasets separately by key, then sort each
> partition, and finally coGroup by key? Can I assume they keep the order on key?
>
> What is the drawback of doing this?
> I expect two data shuffles: one for the partitioning and one for the coGroup.
>
>
> thanks
>
> Best
> michele
