Hi,
I get
java.lang.NullPointerException at
org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:128)
when I try to call createDataFrame on the SparkSession; see below:
SparkConf conf = new SparkConf().setMaster("local[*]").setAppName("test");  // placeholder master URL
conf.set("spark.driver.allowMultipleContexts", "true");  // assumed setting; the key was cut off
The problem is solved.
The actual schema of the Kafka message is different from what the documentation describes:
https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html
The documentation says the type of the "timestamp" column is Long,
but the actual type is Timestamp.
The following
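A minimal sketch that prints the schema the Kafka source actually
exposes (broker address and topic name are placeholders, and a
SparkSession named spark is assumed):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    Dataset<Row> df = spark.readStream()
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")  // placeholder
        .option("subscribe", "topic1")                        // placeholder
        .load();
    df.printSchema();  // "timestamp" prints as timestamp, not long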
As far as I understand, updates to custom accumulators on the driver
side happen during task completion [1].
The documentation states [2] that the very last stage in a job
consists of multiple ResultTasks, each of which executes the task and
sends its output back to the driver application.
Also sources pr
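For reference, a custom accumulator in Spark 2.x extends AccumulatorV2;
a minimal sketch of one that collects distinct strings (the class name
and registration name are made up). Its merge() is what the driver
applies as tasks complete:

    import java.util.HashSet;
    import java.util.Set;
    import org.apache.spark.util.AccumulatorV2;

    class SetAccumulator extends AccumulatorV2<String, Set<String>> {
        private final Set<String> set = new HashSet<>();
        @Override public boolean isZero() { return set.isEmpty(); }
        @Override public AccumulatorV2<String, Set<String>> copy() {
            SetAccumulator c = new SetAccumulator();
            c.set.addAll(set);
            return c;
        }
        @Override public void reset() { set.clear(); }
        @Override public void add(String v) { set.add(v); }
        @Override public void merge(AccumulatorV2<String, Set<String>> other) {
            set.addAll(other.value());
        }
        @Override public Set<String> value() { return set; }
    }

    // Register it so its updates are tracked:
    // spark.sparkContext().register(new SetAccumulator(), "distinctKeys");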
Hi there,
Although Spark's docs state that
- accumulators in actions are guaranteed to be updated only once, while
- accumulators in transformations may be updated multiple times,
... I'm wondering whether the same is true for transformations in the
last stage of the job, or whether there is a guarantee of exactly-once
updates there as well.
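For context, the distinction the docs draw, in a minimal runnable
sketch (the class and accumulator names are made up):

    import java.util.Arrays;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.util.LongAccumulator;

    public class AccumulatorDemo {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .master("local[*]").appName("acc-demo").getOrCreate();
            JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

            LongAccumulator inMap = spark.sparkContext().longAccumulator("inMap");
            LongAccumulator inForeach = spark.sparkContext().longAccumulator("inForeach");

            JavaRDD<Integer> rdd = jsc.parallelize(Arrays.asList(1, 2, 3, 4));

            // Transformation: this update may be re-applied if the task
            // is retried or speculatively executed.
            JavaRDD<Integer> doubled = rdd.map(x -> { inMap.add(1); return 2 * x; });

            // Action: updates here count exactly once per successful task.
            doubled.foreach(x -> inForeach.add(1));

            System.out.println(inMap.value() + " / " + inForeach.value());
            spark.stop();
        }
    }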
As long as you aren't doing any Spark operations that involve a
shuffle, the order you see in Spark should be the same as the order
within the partition.
Can you link to a minimal code example that reproduces the issue?
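To illustrate: narrow operations like map keep the order within each
partition, while anything that shuffles gives no ordering guarantee.
A small sketch (the input path is made up):

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.SparkSession;

    SparkSession spark = SparkSession.builder()
        .master("local[*]").appName("order-demo").getOrCreate();
    JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

    JavaRDD<String> lines = jsc.textFile("events.log");      // made-up input
    JavaRDD<String> upper = lines.map(String::toUpperCase);  // order kept per partition
    JavaRDD<String> mixed = upper.repartition(4);            // shuffle: order not guaranteed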
On Wed, May 9, 2018 at 7:05 PM, karthikjay wrote:
> On the producer side, I make
Does Spark now support Python 3.5, or is it just 3.4.x?
https://spark.apache.org/docs/latest/rdd-programming-guide.html
Thank You,
Irving Duran
This is a long-standing bug in Spark: --jars and --files don't work in
Standalone mode.
https://issues.apache.org/jira/browse/SPARK-4160
From: Marius
Date: Wednesday, May 9, 2018 at 3:51 AM
To: "user@spark.apache.org"
Subject: Spark 2.3.0 --files vs. addFile()
Hey,
I am using Spark to distribute
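For reference, the programmatic route from the subject line --
addFile() plus SparkFiles.get() -- in a minimal sketch (the file path
and master URL are placeholders):

    import org.apache.spark.SparkFiles;
    import org.apache.spark.sql.SparkSession;

    SparkSession spark = SparkSession.builder()
        .master("spark://master:7077")   // placeholder standalone master
        .appName("addfile-demo")
        .getOrCreate();

    // Ship the file to every node that runs tasks for this job.
    spark.sparkContext().addFile("/path/to/lookup.csv");  // placeholder path

    // Resolve the local copy by file name (on the driver or inside a task).
    String localPath = SparkFiles.get("lookup.csv");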