Re: SPIP: Support Kafka delegation token in Structured Streaming

2018-09-29 Thread Saisai Shao
I like this proposal. Since Kafka already provides a delegation token mechanism, we can leverage Spark's delegation token framework to add built-in support for Kafka. BTW, I think there's not much difference between supporting Structured Streaming and DStream, so maybe we can set both as goals. Thanks Sais

Re: Python friendly API for Spark 3.0

2018-09-29 Thread Stavros Kontopoulos
Regarding the Python 3.x upgrade referenced earlier, some people have already gone down that path: https://blogs.dropbox.com/tech/2018/09/how-we-rolled-out-one-of-the-largest-python-3-migrations-ever They describe some good reasons. Stavros On Tue, Sep 18, 2018 at 6:35 PM, Erik Erlandson w

Re: saveAsTable in 2.3.2 throws IOException while 2.3.1 works fine?

2018-09-29 Thread Sean Owen
Looks like a permission issue? Are you sure that isn't the difference, first? On Sat, Sep 29, 2018, 1:54 PM Jacek Laskowski wrote: > Hi, > > The following query fails in 2.3.2: > > scala> spark.range(10).write.saveAsTable("t1") > ... > 2018-09-29 20:48:06 ERROR FileOutputCommitter:314 - Mkdirs f

saveAsTable in 2.3.2 throws IOException while 2.3.1 works fine?

2018-09-29 Thread Jacek Laskowski
Hi, The following query fails in 2.3.2: scala> spark.range(10).write.saveAsTable("t1") ... 2018-09-29 20:48:06 ERROR FileOutputCommitter:314 - Mkdirs failed to create file:/user/hive/warehouse/bucketed/_temporary/0 2018-09-29 20:48:07 ERROR Utils:91 - Aborting task java.io.IOException: Mkdirs fai

Re: [VOTE] SPARK 2.4.0 (RC2)

2018-09-29 Thread Stavros Kontopoulos
+1 Stavros On Sat, Sep 29, 2018 at 5:59 AM, Sean Owen wrote: > +1, with comments: > > There are 5 critical issues for 2.4, and no blockers: > SPARK-25378 ArrayData.toArray(StringType) assume UTF8String in 2.4 > SPARK-25325 ML, Graph 2.4 QA: Update user guide for new features & APIs > SPARK-2531

Re: SPIP: Support Kafka delegation token in Structured Streaming

2018-09-29 Thread Jungtaek Lim
Hi Gabor, Thanks for proposing the feature. I'm definitely interested in seeing this feature, but honestly I'm not familiar with how Spark deals with delegation tokens for HDFS and HBase. I'll try to review the doc in general, learn how it works, and then review again based on that understanding. Thanks, Jung