Hi,
We figured out the issue: it was caused by a high value of spark.network.timeout
in our configuration. After reducing the value of this parameter, the results are
in line with Spark 3.0.1.
Thank you for the support.
Thank you,
Prakash
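
For reference, a minimal sketch of setting spark.network.timeout explicitly when the session is built, so that an inherited or overridden value does not go unnoticed. The application name is hypothetical, and 120s is Spark's documented default, used here only as an illustration; neither is taken from the job discussed above.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical application name; "120s" is Spark's documented default
// for spark.network.timeout, not the value used in the thread.
val spark = SparkSession.builder()
  .appName("network-timeout-example")
  .config("spark.network.timeout", "120s")
  .getOrCreate()

// Print the effective value to confirm what the session actually uses.
println(spark.conf.get("spark.network.timeout"))
```

The same property can also be passed on the command line with spark-submit --conf spark.network.timeout=120s.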
From: Mich Talebzadeh
Sent: Tuesday, August 31, 2021 1:
Hello,
I have a use case where the users of each group id are persisted to a Hive table.
// pseudocode looks like this
usersRDD = sc.parallelize(..)
usersPairRDD = usersRDD.map(u => (u.groupId, u))
groupedUsers = usersPairRDD.groupByKey()
Can I save the groupedUsers RDD into Hive tables where the table name is k
Hi Jungtaek,
Thanks for your reply. I was afraid that the problem is not only on my
side but rather of a conceptual nature. I guess I have to rethink my
approach. However, since you mentioned DeltaLake: I have the same
problem, but the other way around, with DeltaLake. I cannot write with
a strea
Hi,
Still no idea, but I noticed
"org.apache.spark.streaming.kafka010.KafkaRDDPartition" and "--jars
"spark-yarn_2.12-3.1.2.jar,spark-core_2.12-3.1.2.jar,kafka-clients-2.8.0.jar,spark-streaming-kafka-0-10_2.12-3.1.2.jar,spark-token-provider-kafka-0-10_2.12-3.1.2.jar"
\", which bothers me quite a lot.