Hi,

Was this ever resolved? Saving a dataframe is a very common requirement; is there a better way to save one that avoids data being sent back to the driver? I am still hitting:

    "Total size of serialized results of 3722 tasks (1024.0 MB) is bigger
    than spark.driver.maxResultSize (1024.0 MB)"
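In the meantime, the two workarounds I am aware of are to reduce the number of tasks in the write job (each task ships a serialized result back to the driver, which is what trips this limit, per SPARK-12837) or to raise spark.driver.maxResultSize. A rough Scala sketch; the input path, the partition count, and the "4g" limit below are placeholders to adapt, not recommendations:

    import org.apache.spark.sql.SparkSession

    // Blunt fix: raise the cap on serialized task results held by the driver.
    // Must be set before the SparkContext is created.
    val spark = SparkSession.builder()
      .appName("SaveDataset")
      .config("spark.driver.maxResultSize", "4g")  // placeholder value
      .getOrCreate()

    val mydataset = spark.read.parquet("/path/to/input")  // placeholder input

    // Fewer partitions -> fewer tasks -> fewer serialized task results
    // accumulating on the driver while the write job runs.
    mydataset
      .coalesce(200)  // placeholder; pick a count suited to your data volume
      .write
      .csv("outputlocation")

Note that setting spark.driver.maxResultSize to 0 disables the check entirely, but that only trades the exception for a possible driver OOM, so raising it incrementally seems safer.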
*"Total size of serialized results of 3722 tasks (1024.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB) "* Thanks, Baahu On Fri, Mar 17, 2017 at 1:19 AM, Yong Zhang <java8...@hotmail.com> wrote: > You can take a look of https://issues.apache.org/jira/browse/SPARK-12837 > > > Yong > Spark driver requires large memory space for serialized ... > <https://issues.apache.org/jira/browse/SPARK-12837> > issues.apache.org > Executing a sql statement with a large number of partitions requires a > high memory space for the driver even there are no requests to collect data > back to the driver. > > > > ------------------------------ > *From:* Bahubali Jain <bahub...@gmail.com> > *Sent:* Thursday, March 16, 2017 1:39 PM > *To:* user@spark.apache.org > *Subject:* Dataset : Issue with Save > > Hi, > While saving a dataset using * > mydataset.write().csv("outputlocation") * I am running > into an exception > > > > * "Total size of serialized results of 3722 tasks (1024.0 MB) is bigger > than spark.driver.maxResultSize (1024.0 MB)" * > Does it mean that for saving a dataset whole of the dataset contents are > being sent to driver ,similar to collect() action? > > Thanks, > Baahu > -- Twitter:http://twitter.com/Baahu