cutting 2.0.2?

2016-10-16 Thread Reynold Xin
Since 2.0.1, there have been a number of correctness fixes as well as some nice improvements to the experimental Structured Streaming (notably basic Kafka support). I'm thinking about cutting 2.0.2 later this week, before Spark Summit Europe. Let me know if there are specific things (bug fixes) you ...

Re: Spark Improvement Proposals

2016-10-16 Thread Debasish Das
Thanks Cody for bringing up a valid point... I picked up Spark in 2014 as soon as I looked into it, since compared to writing Java map-reduce and Cascading code, Spark made writing distributed code fun... But now, as we go deeper with Spark and the real-time streaming use case gets more prominent, I think ...

Re: Spark Improvement Proposals

2016-10-16 Thread Tomasz Gawęda
Hi everyone, I'm quite late with my answer, but I think my suggestions may help a little bit. :) Many technical and organizational topics were mentioned, but I want to focus on the negative posts about Spark and about "haters". I really like Spark. Ease of use, speed, a very good community - it's ...

Re: Apache Spark chat channel

2016-10-16 Thread Dean Wampler
Okay, here is a Gitter room for this purpose: https://gitter.im/spark-scala/Lobby If you use the APIs, please join and help those who are learning. I can't answer every question. Dean Wampler, Ph.D. Author: Programming Scala, 2nd Edition ...

Re: Why the json file used by sparkSession.read.json must be a valid json object per line

2016-10-16 Thread WangJianfei
Thank you! But I think it's user-unfriendly to process a standard JSON file with a DataFrame. Should we provide a new overridden method to do this?
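
(For context: read.json on 2.0.x expects one complete JSON object per line, so a "standard" file holding a single top-level array is not parsed as multiple records. Below is a minimal sketch of a user-side workaround, assuming Spark 2.0.x in spark-shell, where `spark` is predefined, and the json4s library that Spark already bundles; the path is hypothetical, not an official API.)

    // Workaround sketch for a whole-file ("standard") JSON document on 2.0.x.
    import org.json4s._
    import org.json4s.jackson.JsonMethods._

    // wholeTextFiles keeps each file as a single (path, content) pair, so a
    // multi-line JSON document is never split across records.
    val files = spark.sparkContext.wholeTextFiles("/path/to/standard.json")

    // Re-emit each element of the top-level array as a compact one-line JSON
    // string -- exactly the one-object-per-line shape spark.read.json expects.
    val oneObjectPerLine = files.flatMap { case (_, content) =>
      parse(content) match {
        case JArray(items) => items.map(item => compact(render(item)))
        case other         => List(compact(render(other)))
      }
    }

    val df = spark.read.json(oneObjectPerLine)  // read.json(RDD[String]) in 2.0.x
    df.printSchema()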

Re: Why the json file used by sparkSession.read.json must be a valid json object per line

2016-10-16 Thread trsell
Think of it as JSONL instead of a JSON file. Point people at this if they need an official-looking spec: http://jsonlines.org/ One good reason for using this format is that you can split mid-file easily. This makes it work well with standard Unix tools in pipes. On Sun, 16 Oct 2016 at 16:24, WangJianfe...
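
(For illustration, a minimal sketch of the one-object-per-line shape and how spark.read.json consumes it; the path, app name, and field names are invented.)

    // people.jsonl -- one complete JSON object per line, so the file can be
    // split at any line boundary and the partitions parsed in parallel:
    //   {"name": "alice", "age": 30}
    //   {"name": "bob",   "age": 25}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("jsonl-example").getOrCreate()
    import spark.implicits._

    val people = spark.read.json("/path/to/people.jsonl")
    people.printSchema()
    people.filter($"age" > 26).show()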

Re: Why the json file used by sparkSession.read.json must be a valid json object per line

2016-10-16 Thread WangJianfei
Thank you very much! I will have a look at your link.