top returns the specified number of "largest" elements in your RDD.
They are returned to the driver as an Array. If you want to make an
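RDD out of them again, call SparkContext.parallelize(...). Make sure
this is what you want, though: the elements are first collected on the
driver, so this is only sensible when the number you ask for is small.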
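
For example, here is a minimal sketch, assuming a local master and made-up
sample data. The object name TopExample and the helper topAsRdd are
hypothetical and only for illustration; top and parallelize are the real
calls.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object TopExample {
  // Hypothetical helper: take the k largest elements and turn them back into an RDD.
  def topAsRdd(sc: SparkContext, rdd: RDD[Int], k: Int): RDD[Int] = {
    // top(k) is an action: the result is a local Array[Int] on the driver.
    val largest: Array[Int] = rdd.top(k)
    // parallelize distributes that local collection again as a new RDD.
    sc.parallelize(largest.toSeq)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: running locally, just to try it out.
    val conf = new SparkConf().setAppName("top-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(5, 1, 9, 3, 7))
      val largest = rdd.top(3)            // Array(9, 7, 5), largest first
      println(largest.mkString(", "))
      val again = topAsRdd(sc, rdd, 3)    // an RDD[Int] holding the same three values
      println(again.count())
    } finally {
      sc.stop()
    }
  }
}
```

So the answer to "do I get an RDD back" depends on the method: transformations
such as map or filter give you an RDD, while actions such as top, take or
collect give you plain local values.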
On Wed, Sep 24, 2014 at 5:33 AM, Deep Pradhan wrote:
> Hi,
> I
Here is my understanding:
def takeOrdered(num: Int)(implicit ord: Ordering[T]): Array[T] = {
  if (num == 0) { // if 0, return an empty array
    Array.empty
  } else {
    mapPartitions { items => // map each partition to a new one
      // whose iterator consists of the single queue, wh
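
As a rough illustration of that per-partition idea, here is a sketch in plain
Scala, without Spark. It is not Spark's actual code: the partitions are
modelled as ordinary Seqs, and the names TakeOrderedSketch and
takeOrderedSketch are made up for this example.

```scala
object TakeOrderedSketch {
  // Hypothetical stand-in for takeOrdered: `partitions` models an RDD's
  // partitions as plain local sequences.
  def takeOrderedSketch[T](partitions: Seq[Seq[T]], num: Int)(implicit ord: Ordering[T]): Seq[T] = {
    if (num <= 0 || partitions.isEmpty) {
      Seq.empty
    } else {
      // Step 1: every partition keeps only its own `num` smallest elements.
      val perPartition: Seq[Seq[T]] = partitions.map(_.sorted(ord).take(num))
      // Step 2: merge the small per-partition results pairwise,
      // trimming back to `num` elements after each merge.
      perPartition.reduce { (a, b) => (a ++ b).sorted(ord).take(num) }
    }
  }

  def main(args: Array[String]): Unit = {
    val parts = Seq(Seq(5, 1, 9), Seq(3, 7), Seq(8, 2))
    // Smallest three under the natural ordering: 1, 2, 3
    println(takeOrderedSketch(parts, 3).mkString(", "))
    // Reversing the ordering gives the largest three instead: 9, 8, 7
    println(takeOrderedSketch(parts, 3)(Ordering[Int].reverse).mkString(", "))
  }
}
```

The second call mirrors how top relates to takeOrdered: top(num) is
takeOrdered(num) with the ordering reversed. In both cases the final result is
a small local collection, which is why these methods return an Array and not
an RDD.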
Hi,
Is it always possible to get one RDD from another?
For example, if I do a *top(K)(Ordering)*, I get an Int, right? (In my
example the type is Int.) I do not get an RDD.
Can anyone explain this to me?
Thank You