If your values are already a Seq or a similar collection, you can just
take the top 1 element, ordered by the size of the value:

rdd.top(1)(Ordering.by(_._2.size))
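
Since top returns an Array of (key, values) pairs rather than the key
itself, you still have to pull the key out of the result. Here is a
minimal sketch of that step, assuming an RDD[(String, Seq[Int])] built
from made-up sample data and a local master. The object name
LargestValueKey and the helper keyWithMostValues are hypothetical names
used only for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LargestValueKey {
  // Hypothetical helper: returns the key whose value collection is largest,
  // or None if top(1) comes back empty.
  def keyWithMostValues[K, V](rdd: RDD[(K, Seq[V])]): Option[K] = {
    // Order the (key, values) pairs by the number of values.
    val bySize: Ordering[(K, Seq[V])] = Ordering.by(_._2.size)
    // top returns an Array; headOption avoids an exception if it is empty.
    rdd.top(1)(bySize).headOption.map(_._1)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, purely for trying this out.
    val conf = new SparkConf().setAppName("largest-value-key").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data.
      val rdd = sc.parallelize(Seq(
        "a" -> Seq(1, 2),
        "b" -> Seq(1, 2, 3, 4),
        "c" -> Seq(1)
      ))
      println(keyWithMostValues(rdd)) // prints Some(b)
    } finally {
      sc.stop()
    }
  }
}
```

If several keys tie for the largest size, this returns only one of them.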


On Thu, Aug 28, 2014 at 9:34 AM, Deep Pradhan <pradhandeep1...@gmail.com>
wrote:

> Hi,
> I have an RDD of key-value pairs. Now I want to find the "key" for which
> the "values" have the largest number of elements. How should I do that?
> Basically, I want to select the key for which the number of items in values
> is the largest.
> Thank You
>
