> > I think you can use MapPartition for that.
> > So basically:
> >
> > dataset // assuming some partitioning that can be reused to avoid a shuffle
> > .sortPartition(1, Order.DESCENDING)
> > .mapPartition(new ReturnFirstTen())
> > .
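For illustration, here is a rough sketch of the per-partition logic that a function like ReturnFirstTen might contain. It is plain Java with no Flink dependency. The names TopRatings, Rated and topK are hypothetical and do not come from the thread or from any Flink API. Unlike the quoted suggestion, it does not depend on the partition being sorted first. It keeps a bounded min-heap of the best k elements seen so far. Inside Flink, the same loop would presumably sit in the body of the function passed to mapPartition, and the per-partition candidates would still have to be merged in one final step to get the global top 10.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopRatings {

    /** Hypothetical stand-in for a two-field (id, rating) tuple. */
    public static final class Rated {
        public final long id;
        public final double rating;

        public Rated(long id, double rating) {
            this.id = id;
            this.rating = rating;
        }
    }

    /**
     * Returns up to k elements with the highest rating, highest first.
     * Memory use is O(k) regardless of the size of the input.
     */
    public static List<Rated> topK(Iterable<Rated> partition, int k) {
        Comparator<Rated> byRating = Comparator.comparingDouble(r -> r.rating);

        // Min-heap: the head is the weakest of the current candidates.
        PriorityQueue<Rated> heap = new PriorityQueue<>(byRating);
        for (Rated r : partition) {
            heap.offer(r);
            if (heap.size() > k) {
                heap.poll(); // evict the lowest-rated candidate
            }
        }

        // Copy the survivors out and order them by rating, descending.
        List<Rated> result = new ArrayList<>(heap);
        result.sort(byRating.reversed());
        return result;
    }
}
```

Calling topK(elements, 10) on each partition, and then once more on the combined results, would give the 10 highest-rated elements overall.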
Hi,
I have a dataset of tuples with two fields, ids and ratings, and I need to
find the 10 elements with the highest rating in this dataset. I found a
solution, but I think it's suboptimal and there should be a better
way to do it.
The best thing that I came up with is to partition dataset by r