Hi,
I'm trying to add Spark Streaming on top of our Kafka topic, but I keep getting
this error:
java.lang.AssertionError: assertion failed: Failed to get record after
polling for 512 ms.
I tried setting different parameters, such as max.poll.interval.ms and
spark.streaming.kafka.consumer.poll.ms (e.g. to 1 ms), on the Kafka side.
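For what it's worth, this timeout is usually raised rather than lowered; lowering it to 1 ms gives the consumer almost no time to fetch a record. A sketch of the relevant spark-defaults.conf entries, with illustrative values only (and note spark.streaming.kafka.consumer.poll.ms falls back to spark.network.timeout when unset):

```
# Give the consumer more time to fetch a record before the assertion fires
spark.streaming.kafka.consumer.poll.ms   10000
spark.network.timeout                    300s
```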
Hi all,
I'm performing a spark-submit using the Spark REST API (a POST to port 6066)
with the config below:
> Launch Command:
> "/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.141-1.b16.el7_3.x86_64/jre/bin/java"
> "-cp" "/usr/local/spark/conf/:/usr/local/spark/jars/*" "-Xmx4096M"
> "-Dspark.eventLog.enabled=t
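For reference, a submission to the standalone master's REST endpoint on port 6066 is a POST to /v1/submissions/create with a JSON body along these lines. This endpoint is not officially documented, and the paths, class name, and version below are placeholders:

```
{
  "action": "CreateSubmissionRequest",
  "appResource": "file:/path/to/app.jar",
  "mainClass": "com.example.Main",
  "clientSparkVersion": "2.2.0",
  "appArgs": [],
  "environmentVariables": { "SPARK_ENV_LOADED": "1" },
  "sparkProperties": {
    "spark.app.name": "MyApp",
    "spark.master": "spark://master-host:7077",
    "spark.eventLog.enabled": "true",
    "spark.driver.memory": "4g",
    "spark.jars": "file:/path/to/app.jar"
  }
}
```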
Thanks, all. I am not submitting a Spark job explicitly. Instead, I am using
the Spark library functionality embedded in my web service, as shown in the
code I included in the previous email. So, effectively, Spark SQL runs in
the web service's JVM. Therefore, the --driver-memory option would not (and did
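Since the driver in this setup is the web service's own JVM, driver memory has to be sized where that JVM is launched; a sketch, with the jar name, main class, and heap size as placeholders:

```
# --driver-memory only applies to drivers launched by spark-submit;
# for an embedded SparkSession, size the host JVM's heap directly:
java -Xmx8g -cp webservice.jar com.example.WebService
```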
Yes!
Thanks!
~Kedar Dixit
Bigdata Analytics at Persistent Systems Ltd.
From: Holden Karau
Sent: 14 November 2017 20:04:50
To: Kedarnath Dixit
Cc: user@spark.apache.org
Subject: Re: Use of Accumulators
And where do you want to read the toggle back from? On the driver?
On Tue, Nov 14, 2017 at 3:52 AM Kedarnath Dixit <
kedarnath_di...@persistent.com> wrote:
> Hi,
>
>
>
> Inside the transformation if there is any change for the Variable’s
> associated data, I want to just toggle it saying there is
Hi,
I have a problem with Structured Streaming and Kafka.
I have 2 brokers and a topic with 8 partitions and replication factor 2.
This is my driver program:
public static void main(String[] args)
{
    SparkSession spark = SparkSession
        .builder()
        .appName("MyStreamingApp")  // name assumed; the original snippet was truncated here
        .getOrCreate();
    // ... (remainder of the snippet was cut off in the original message)
}
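Separately from the (truncated) driver above, these are the options one would typically pass to spark.readStream().format("kafka") for a setup like this; a sketch only, where the broker addresses and topic name are placeholders:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KafkaSourceOptions {
    // Builds the option map for Structured Streaming's Kafka source.
    static Map<String, String> kafkaOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        // Placeholder brokers: with 2 brokers and replication factor 2,
        // list both so the source can bootstrap from either.
        options.put("kafka.bootstrap.servers", "broker1:9092,broker2:9092");
        options.put("subscribe", "my-topic");       // placeholder topic name
        options.put("startingOffsets", "latest");
        return options;
    }

    public static void main(String[] args) {
        kafkaOptions().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With a SparkSession in hand, the map would be applied via .options(kafkaOptions()) on the readStream builder.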
Without knowing anything about your pipeline, the best estimate of the resources
needed is to run the job with the same ingestion rate as the normal production load.
With Kafka you can enable back pressure, so under high load your latency
will just increase, but you don't have to have capacity for
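The back pressure mentioned above is a Spark Streaming configuration switch; a sketch of the relevant spark-defaults.conf entries, with the rate ceiling being an illustrative value:

```
spark.streaming.backpressure.enabled          true
# optional per-Kafka-partition ceiling while the backpressure controller ramps up
spark.streaming.kafka.maxRatePerPartition     1000
```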
Hi,
I was wondering if anyone has done some work around measuring the cluster
resource utilization of a "typical" Spark Streaming job.
We are trying to build a message ingestion system which will read from
Kafka and do some processing. We have had some concerns raised in the team
that a 24x7 str
Hi,
Inside the transformation, if there is any change to the variable's associated
data, I want to just toggle a flag saying there is some change while processing
the data.
Please let me know if we can do this at runtime.
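One common way to surface such a toggle is a LongAccumulator: executors bump it inside the transformation, and the driver reads it back after an action has run. A minimal sketch, assuming Spark 2.x on the classpath; the class name, the "change" condition, and the sample values are all illustrative, and note that accumulator updates made inside transformations can be applied more than once on task retries, so treat the value as a signal, not an exact count:

```java
import java.util.Arrays;

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.util.LongAccumulator;

public class ToggleFlag {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .master("local[*]")             // local mode, for illustration only
            .appName("AccumulatorToggle")
            .getOrCreate();

        // The accumulator acts as the "toggle" shared between executors and driver.
        LongAccumulator changed = spark.sparkContext().longAccumulator("changed");

        spark.createDataset(Arrays.asList(1, 2, 3), Encoders.INT())
            .map((MapFunction<Integer, Integer>) x -> {
                if (x > 2) {                // hypothetical "data changed" condition
                    changed.add(1);
                }
                return x;
            }, Encoders.INT())
            .count();                       // an action forces the map to run

        // Read the toggle on the driver, only after the action has finished.
        System.out.println(changed.value() > 0);
        spark.stop();
    }
}
```

Reading the accumulator before an action has completed would show a stale value, which is why the driver checks it only after count() returns.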
Thanks!
~Kedar Dixit
Bigdata Analytics at Persistent Systems Ltd.