How can I dynamically parse JSON from Kafka when using Structured Streaming

2019-09-16 Thread lk_spark
Hi all, I'm using Structured Streaming to read from Kafka. The data is a JSON string, and I want to parse it and convert it to a DataFrame. My code does not compile and I don't know how to fix it: val lines = messages.selectExpr("CAST(value AS STRING) as value").as[String] val words = lines.map(

Re: [EXTERNAL] Re: Conflicting PySpark Storage Level Defaults?

2019-09-16 Thread grp
Running a simple test: here is the Stack Overflow code snippet using .count() as the action. You can see the differences between the storage levels. print(spark.version) 2.4.3 # id 3 => using default storage level for df (memory_and_disk) and unsure why storage level is not serialized since i

Can anyone suggest what is wrong with my spark job here?

2019-09-16 Thread Shyam P
Hi, my Spark job works fine locally, but it has an issue on the Spark cluster. Can anyone suggest what is wrong here? https://stackoverflow.com/questions/57960569/accessing-external-yml-file-in-my-spark-job-code-not-working-throwing-cant-con Regards, Shyam

Unable to verify in-transit encryption

2019-09-16 Thread G R
This is a duplicate of my Stack Overflow question here: https://stackoverflow.com/questions/57881044/verifying-in-transit-encryption-for-spark-shuffle I'm running Spark over YARN on AWS EMR 5.20. I followed this guide for running in-transit encryption for Spark shuffle: https://docs

Re: Conflicting PySpark Storage Level Defaults?

2019-09-16 Thread Jörn Franke
I don't know your full source code, but you may be missing an action, which is what actually causes the data to be persisted. > On 16.09.2019 at 02:07, grp wrote: > > Hi There Spark Users, > > Curious what is going on here. Not sure if this is a possible bug or if I'm missing > something. Extra eyes are much appreciated. > > Spark UI