Re: Dataset : Issue with Save

2017-03-17 Thread Yong Zhang
From: Bahubali Jain
Sent: Thursday, March 16, 2017 11:41 PM
To: Yong Zhang
Cc: user@spark.apache.org
Subject: Re: Dataset : Issue with Save

I am using Spark 2.0. There are comments in the ticket since Oct 2016 which clearly mention that the issue still persists even in 2.0.

Re: Dataset : Issue with Save

2017-03-16 Thread Bahubali Jain
> If you are looking for a workaround, the JIRA ticket clearly shows how to
> increase your driver heap. 1 GB in today's world really is kind of small.
>
> Yong
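The workaround Yong points at (a bigger driver heap, plus a higher cap on serialized task results) can be passed at submit time. A minimal sketch, assuming a job packaged as `my-job.jar` with main class `com.example.MyJob` (both hypothetical names):

```shell
# Raise the driver heap above the 1 GB default, and raise the cap on
# the total size of serialized task results returned to the driver.
spark-submit \
  --driver-memory 4g \
  --conf spark.driver.maxResultSize=2g \
  --class com.example.MyJob \
  my-job.jar
```

Setting `spark.driver.maxResultSize=0` removes the cap entirely, at the risk of an out-of-memory error on the driver.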

Re: Dataset : Issue with Save

2017-03-16 Thread Yong Zhang
Sent: Thursday, March 16, 2017 10:34 PM
To: Yong Zhang
Cc: user@spark.apache.org
Subject: Re: Dataset : Issue with Save

Hi, Was this not yet resolved? It's a very common requirement to save a dataframe; is there a better way to save a dataframe by avoiding data being sent to the driver? "Total size of serial…

Re: Dataset : Issue with Save

2017-03-16 Thread Bahubali Jain
Hi, Was this not yet resolved? It's a very common requirement to save a dataframe; is there a better way to save a dataframe by avoiding data being sent to the driver? *"Total size of serialized results of 3722 tasks (1024.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)"* Thanks, Baahu
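The numbers in that error are worth unpacking: the limit applies to the *sum* of serialized results across all tasks, so with thousands of tasks even a modest per-task payload overflows the 1024 MB default. A back-of-the-envelope check in plain Python, using the figures from the error message:

```python
tasks = 3722
limit_mb = 1024.0  # default value of spark.driver.maxResultSize

# Average serialized result per task that is enough to trip the limit:
per_task_kb = limit_mb * 1024 / tasks
print(f"about {per_task_kb:.0f} KB per task")  # prints "about 282 KB per task"
```

So a job does not need to `collect()` large data to hit this: roughly 282 KB of serialized results per task, multiplied across 3722 tasks, is already over the cap.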

Re: Dataset : Issue with Save

2017-03-16 Thread Yong Zhang
You can take a look at https://issues.apache.org/jira/browse/SPARK-12837 ("Spark driver requires large memory space for serialized …")

Yong
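For completeness, the cap discussed in SPARK-12837 can also be raised when building the session rather than on the `spark-submit` command line. A minimal PySpark sketch, illustrative only (the app name and output path are hypothetical, and running it requires a Spark installation):

```python
from pyspark.sql import SparkSession

# Raise the cap on serialized task results; "0" would remove the
# limit entirely, at the risk of driver OOM.
spark = (
    SparkSession.builder
    .appName("save-example")  # hypothetical app name
    .config("spark.driver.maxResultSize", "2g")
    .getOrCreate()
)

# Writing from the executors (rather than collect()-ing rows) keeps
# row data off the driver; only per-task statuses and metrics are
# serialized back, which is what SPARK-12837 is about.
# df.write.parquet("/tmp/out")  # hypothetical output path
```

Note that per the JIRA discussion, the serialized results counted against the limit include per-task metadata, so saves over very many partitions can trip it even though no row data is collected.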