Re: Save a spark RDD to disk

2016-11-09 Thread Michael Segel
Can you increase the number of partitions and also the number of executors? (This should improve parallelism, but you may become disk I/O bound.)

On Nov 8, 2016, at 4:08 PM, Elf Of Lothlorein <redarro...@gmail.com> wrote:
> Hi, I am trying to save an RDD to disk and I am using
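
For illustration, the suggestion to raise the partition count before saving might look like the sketch below. It is only a sketch under assumptions: the helper names `partitions_for` and `save_with_more_partitions`, the 128 MiB target partition size, and the use of `saveAsTextFile` are hypothetical choices, since the original message does not say which save method or data size is involved. The executor count is set at submit time (for example through `spark.executor.instances`) and is not shown here.

```python
import math


def partitions_for(total_bytes, target_partition_bytes=128 * 1024 * 1024):
    """Pick a partition count so each partition holds roughly
    target_partition_bytes (the 128 MiB default is an assumed value)."""
    return max(1, math.ceil(total_bytes / target_partition_bytes))


def save_with_more_partitions(rdd, path, num_partitions):
    """Repartition an RDD, then write it out.

    rdd is expected to be a PySpark RDD (anything exposing repartition()
    and saveAsTextFile() works). repartition() shuffles the data into
    num_partitions partitions, and the save writes one part file per
    partition, so more partitions means more write tasks that can run
    at the same time.
    """
    rdd.repartition(num_partitions).saveAsTextFile(path)
```

With a real RDD this would be called as `save_with_more_partitions(rdd, out_path, partitions_for(size_in_bytes))`. More write tasks only help while the disks can keep up; past that point the job is disk I/O bound, as the reply notes.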

Re: Save a spark RDD to disk

2016-11-08 Thread Andrew Holway
That's around 750MB/s, which seems quite respectable even in this day and age! How many and what kind of disks do you have attached to your nodes? What are you expecting?

On Tue, Nov 8, 2016 at 11:08 PM, Elf Of Lothlorein wrote:
> Hi
> I am trying to save an RDD to disk and I am using the
> saveAs