Doesn’t that just read in all the values?  Is the count not pre-computed?
It’s not the end of the world if it isn’t, but a pre-computed count would be faster.
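
For reference, here is a minimal sketch of the per-partition count being
discussed, written in Python as an assumption (the thread does not say which
Spark API language is in use). The helper name count_partition is
hypothetical. As far as I know the count is not pre-computed, so each
partition is still read once, but walking the iterator avoids buffering the
whole partition in an array.

```python
from typing import Iterable, Iterator, Tuple


def count_partition(index: int, records: Iterable) -> Iterator[Tuple[int, int]]:
    """Yield a single (partition index, record count) pair for one partition."""
    # Walk the iterator once; nothing is buffered, unlike converting to an array.
    count = sum(1 for _ in records)
    yield index, count


if __name__ == "__main__":
    # Plain ranges stand in for four RDD partitions so the sketch runs
    # without a Spark cluster (sizes taken from the example in this thread).
    fake_partitions = [range(100), range(104), range(90), range(140)]
    for i, part in enumerate(fake_partitions):
        for idx, n in count_partition(i, iter(part)):
            print(f"partition {idx}: {n} records")
```

With PySpark, and assuming an existing RDD named rdd, the same function
should be usable as rdd.mapPartitionsWithIndex(count_partition).collect(),
which would return a list of (index, count) pairs.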

On Mon, Jan 12, 2015 at 8:09 PM, Ganelin, Ilya <[email protected]>
wrote:

>  Use the mapPartitions function. It passes your function an iterator over
> each partition. Then just get the length by converting it to an array.
>
> -----Original Message-----
> *From: *Kevin Burton [[email protected]]
> *Sent: *Monday, January 12, 2015 09:55 PM Eastern Standard Time
> *To: *[email protected]
> *Subject: *quickly counting the number of rows in a partition?
>
> Is there a way to compute the total number of records in each RDD
> partition?
>
> So say I had 4 partitions.. I’d want to have
>
> partition 0: 100 records
> partition 1: 104 records
> partition 2: 90 records
> partition 3: 140 records
>
> Kevin
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
>  <http://spinn3r.com>
>



