> On 10 Feb 2016, at 10:56, Eli Super <eli.su...@gmail.com> wrote:
>
> Hi
>
> I work with pyspark & Spark 1.5.2.
>
> Currently, saving an RDD into a CSV file is very slow and uses only 2% CPU.
>
> I use:
>
>     my_dd.write.format("com.databricks.spark.csv").option("header", "false").save('file:///my_folder')
>
> Is there a way to save CSV faster?
>
> Many thanks
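For reference, a minimal runnable sketch of the save path quoted above, for Spark 1.5.x with the spark-csv package on the classpath. The DataFrame contents, app name, and package version are placeholders, not from the thread:

    # Hypothetical reproduction; launch with e.g.
    #   --packages com.databricks:spark-csv_2.10:1.3.0
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="csv-save-example")
    sqlContext = SQLContext(sc)

    # Placeholder DataFrame standing in for my_dd.
    my_dd = sqlContext.createDataFrame(
        [(i, str(i)) for i in range(1000)], ["id", "value"])

    # Each partition is written by one task, so very few partitions can
    # also show up as low overall CPU use during a save.
    print(my_dd.rdd.getNumPartitions())

    my_dd.write.format("com.databricks.spark.csv") \
        .option("header", "false") \
        .save("file:///my_folder")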
On a Linux box you can run "iotop" to see what's happening on the disks; it may just be that the disk is the bottleneck.
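For example (the flags are standard iotop options, not from the thread: -o shows only processes currently doing I/O, -d sets the refresh interval in seconds):

    sudo iotop -o -d 2

If the disk shows sustained throughput near its limit while CPU stays around 2%, the save is I/O-bound rather than CPU-bound.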