Hi Walrus,

Try caching the transformed RDD (rdd.cache()) just before calling rdd.count.
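
In case it helps, here is a rough sketch of what I mean. Your code was not posted, so this is only a guess at the cause: transformations such as map and filter do not change the RDD they are called on, they return a new RDD, and rdd.count does not change that either. If foreachPartition is called on the original variable, it sends the data as ingested. Everything named here (ForeachPartitionSketch, sendPartition, the sample data, the local[2] master) is made up for illustration, and sendPartition only prints where your real code would write to S3.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ForeachPartitionSketch {
  // Hypothetical stand-in for the S3 upload. A real job would create its S3
  // client here, inside the task, so that it is never shipped from the driver.
  def sendPartition(records: Iterator[String]): Unit =
    records.foreach(r => println(s"would upload: $r"))

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, only so the sketch can run on one machine.
    val conf = new SparkConf()
      .setAppName("foreach-partition-sketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up input standing in for the ingested data.
      val raw: RDD[String] = sc.parallelize(Seq("a", "b", "c"), 2)

      // Transformations never modify `raw`; each one returns a new RDD.
      val transformed: RDD[String] = raw.map(_.toUpperCase).filter(_ != "B")

      // Keep the transformed data around so that count and foreachPartition
      // do not both recompute the whole chain.
      transformed.cache()
      transformed.count()

      // Call the action on the transformed RDD, not on `raw`.
      transformed.foreachPartition(sendPartition)
    } finally {
      sc.stop()
    }
  }
}
```

The cache and count calls only save recomputation. What decides which data goes out is the RDD that foreachPartition is called on.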

Regards,
Ajay

> On Nov 13, 2015, at 7:56 PM, Walrus theCat <[email protected]> wrote:
> 
> Hi,
> 
> I have an RDD which crashes the driver when being collected.  I want to send 
> the data on its partitions out to S3 without bringing it back to the driver. 
> I try calling rdd.foreachPartition, but the data that gets sent has not gone 
> through the chain of transformations that I need.  It's the data as it was 
> ingested initially.  After specifying my chain of transformations, but before 
> calling foreachPartition, I call rdd.count in order to force the RDD to 
> transform.  The data it sends out is still not transformed.  How do I get the 
> RDD to send out transformed data when calling foreachPartition?
> 
> Thanks
