Re: Thanks For a Job Well Done !!!

2016-06-18 Thread Reynold Xin
Thanks for the kind words, Krishna! Please keep the feedback coming. On Saturday, June 18, 2016, Krishna Sankar wrote: > Hi all, > Just wanted to thank all for the dataset API - most of the time we see > only bugs on these lists ;o). > > - Putting some context, this weekend I was updating

Thanks For a Job Well Done !!!

2016-06-18 Thread Krishna Sankar
Hi all, Just wanted to thank all for the dataset API - most of the time we see only bugs on these lists ;o). - Putting some context: this weekend I was updating the SQL chapters of my book - it had all the ugliness of SchemaRDD, registerTempTable, take(10).foreach(println) and take
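
As context for the cleanup described above, here is a minimal sketch, assuming a hypothetical people.json input: registerTempTable and take(n).foreach(println) were the 1.x idioms, while createOrReplaceTempView and show(n) are their Spark 2.0 counterparts.

{code}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("dataset-api-sketch").getOrCreate()

// Hypothetical input; any DataFrame source illustrates the same point.
val people = spark.read.json("people.json")

// Spark 1.x style (deprecated in 2.0):
//   people.registerTempTable("people")
//   sqlContext.sql("SELECT * FROM people").take(10).foreach(println)

// Spark 2.0 style:
people.createOrReplaceTempView("people")
spark.sql("SELECT * FROM people").show(10)
{code}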

Re: Does dataframe write append mode work with text format

2016-06-18 Thread Yash Sharma
Awesome!! Will give it a try again. Thanks!! - Thanks, via mobile, excuse brevity. On Jun 19, 2016 11:32 AM, "Xiao Li" wrote: > Hi, Yash, > > It should work. > > val df = spark.range(1, 5) > .select('id + 1 as 'p1, 'id + 2 as 'p2, 'id + 3 as 'p3, 'id + 4 as 'p4, > 'id + 5 as 'p5, 'id as 'b

Re: Does dataframe write append mode work with text format

2016-06-18 Thread Xiao Li
Hi, Yash, It should work.

val df = spark.range(1, 5)
  .select('id + 1 as 'p1, 'id + 2 as 'p2, 'id + 3 as 'p3, 'id + 4 as 'p4, 'id + 5 as 'p5, 'id as 'b)
  .selectExpr("p1", "p2", "p3", "p4", "p5", "CAST(b AS STRING) AS s").coalesce(1)
df.write.partitionBy("p1", "p2", "p3", "p4", "p5").text(
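
For anyone trying this out, a slightly fuller runnable sketch of the snippet above, assuming a spark-shell session (so spark and its implicits are available) and a throwaway output path; the DataFrame construction mirrors the preview, while the final write is an assumption about where the truncated line was headed.

{code}
import spark.implicits._   // for the 'symbol column syntax

val df = spark.range(1, 5)
  .select('id + 1 as 'p1, 'id + 2 as 'p2, 'id + 3 as 'p3,
          'id + 4 as 'p4, 'id + 5 as 'p5, 'id as 'b)
  .selectExpr("p1", "p2", "p3", "p4", "p5", "CAST(b AS STRING) AS s")
  .coalesce(1)

// text() writes a single string column; partitionBy moves p1..p5 into the
// directory layout, leaving only the string column s to be written.
df.write
  .mode("append")
  .partitionBy("p1", "p2", "p3", "p4", "p5")
  .text("/tmp/partitioned-text-sketch")
{code}

Running the write twice should simply add files under the existing partition directories, which is the behaviour the thread is asking about.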

Does dataframe write append mode work with text format

2016-06-18 Thread Yash Sharma
Hi All, I have been using the parquet append mode for writes, which works just fine. Just wanted to check whether the same is supported for the plain text format. The code below blows up with an error saying the file already exists. {code} userEventsDF.write.mode("append").partitionBy("year", "month", "date")
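
To make the comparison concrete, a hedged sketch of the two write paths being described; the userEventsDF contents and both output paths are invented for illustration, and only the mode/partitionBy calls mirror the snippet above.

{code}
import spark.implicits._   // assuming a spark-shell session

// Hypothetical events data carrying the partition columns from the snippet.
val userEventsDF = Seq(
  (2016, 6, 18, "login"),
  (2016, 6, 19, "click")
).toDF("year", "month", "date", "event")

// Parquet append with partitioning: the case reported as working fine.
userEventsDF.write.mode("append")
  .partitionBy("year", "month", "date")
  .parquet("/tmp/events-parquet")

// The text equivalent in question. text() expects a single string column
// once the partition columns are factored out, which the remaining
// event column satisfies here.
userEventsDF.write.mode("append")
  .partitionBy("year", "month", "date")
  .text("/tmp/events-text")
{code}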

Re: [VOTE] Release Apache Spark 1.6.2 (RC1)

2016-06-18 Thread Jacek Laskowski
+1 Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark http://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski On Sat, Jun 18, 2016 at 9:13 AM, Reynold Xin wrote: > Looks like that's resolved now. > > I will wait till Sunday t

Re: Spark 2.0 Dataset Documentation

2016-06-18 Thread Pedro Rodriguez
Going to go ahead and start working on the docs, assuming this gets merged: https://github.com/apache/spark/pull/13592. Opened a JIRA: https://issues.apache.org/jira/browse/SPARK-16046 Having some issues building the docs. The Java docs fail to build. Output when it fails is here: https://gist.github.

Re: Spark 2.0 Dataset Documentation

2016-06-18 Thread Jacek Laskowski
On Sat, Jun 18, 2016 at 6:13 AM, Pedro Rodriguez wrote: > using Datasets (e.g. using $ to select columns). Or even my favourite one - the backtick ` :-) Jacek
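
As a tiny illustration of the two shorthands being praised, a sketch with a made-up DataFrame whose second column name contains a dot so the backtick has something to escape:

{code}
import spark.implicits._   // brings the $"..." column interpolator into scope

val df = Seq((1, "a"), (2, "b")).toDF("id", "my.label")   // hypothetical data

// $ selects a column by name.
df.select($"id").show()

// Backticks escape column names containing dots or other special characters.
df.selectExpr("`my.label`").show()
{code}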

Re: [VOTE] Release Apache Spark 1.6.2 (RC1)

2016-06-18 Thread Reynold Xin
Looks like that's resolved now. I will wait till Sunday to cut rc2 to give people more time to find issues with rc1. On Fri, Jun 17, 2016 at 10:58 AM, Marcelo Vanzin wrote: > -1 (non-binding) > > SPARK-16017 shows a severe perf regression in YARN compared to 1.6.1. > > On Thu, Jun 16, 2016 at