Kafka clients are blocking the Spark Streaming jobs, and after a while the
streaming job queue keeps growing.
-----Original Message-----
From: Cody Koeninger [mailto:c...@koeninger.org]
Sent: Tuesday, December 26, 2017 6:47 PM
To: Diogo Munaro Vieira
Cc: Serkan TAS ; user
Subject: Re: Which kafka client to u
I am trying to deploy a standalone cluster but am running into ClassNotFound
errors.
I have tried a myriad of approaches, ranging from packaging all dependencies
into a single JAR to using the --packages and --driver-class-path options.
I’ve got a master node started, a slave node ru
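Not from the thread, but a minimal build.sbt sketch of the single-assembly-JAR
approach mentioned above, assuming sbt-assembly and placeholder versions; the
module list is illustrative only.

// Hypothetical build.sbt fragment (versions and modules are placeholders).
// Marking Spark itself as Provided keeps it out of the assembly JAR, so only
// the application's own dependencies get bundled.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.1" % Provided,
  "org.apache.spark" %% "spark-sql"  % "2.2.1" % Provided
)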
Thanks Diogo. My question is how to gracefully call the stop method while the
streaming application is running in a cluster.
On Monday, December 25, 2017 5:39 PM, Diogo Munaro Vieira wrote:
Hi M Singh! Here I'm using query.stop()
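Not from the thread, but a minimal sketch of the query.stop() approach,
assuming query is an active StreamingQuery; the marker-file shutdown signal is
a made-up placeholder.

import java.nio.file.{Files, Paths}
import org.apache.spark.sql.streaming.StreamingQuery

// Poll for an external stop marker and stop the query gracefully when it appears.
def stopOnMarker(query: StreamingQuery, marker: String = "/tmp/stop-streaming"): Unit = {
  while (query.isActive) {
    if (Files.exists(Paths.get(marker))) query.stop()
    Thread.sleep(5000)
  }
}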
On 25 Dec 2017 19:19, "M Singh" wrote:
Hi:
There is also this project
https://github.com/SciSpark/SciSpark
It might be of interest to you, Christopher.
2017-12-16 3:46 GMT-05:00 Jörn Franke :
> Develop your own HadoopFileFormat and use
> https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/SparkContext.html#newAPIHadoopRDD(o
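For illustration only, a hedged sketch of the newAPIHadoopRDD suggestion
above, using the built-in TextInputFormat as a stand-in for a custom
InputFormat and a placeholder input path.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.SparkContext

val sc = SparkContext.getOrCreate()
val hadoopConf = new Configuration()
// Placeholder input directory.
hadoopConf.set("mapreduce.input.fileinputformat.inputdir", "hdfs:///data/input")

// Read (offset, line) pairs through the Hadoop "new" API; a custom
// InputFormat would be substituted for TextInputFormat here.
val rdd = sc.newAPIHadoopRDD(hadoopConf, classOf[TextInputFormat],
  classOf[LongWritable], classOf[Text])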
Functions are still limited to 22 arguments
https://github.com/scala/scala/blob/2.13.x/src/library/scala/Function22.scala
On Tue, Dec 26, 2017 at 2:19 PM, Felix Cheung wrote:
> Generally the 22 limitation is from Scala 2.10.
>
> In Scala 2.11, the issue with case class is fixed, but with that s
Hi,
I am trying to connect my Spark cluster to a single Kafka topic that is
running as a separate process on a machine. While submitting the Spark
application, I am getting the following error:
17/12/25 16:56:57 ERROR TransportRequestHandler: Error sending result
StreamResponse{streamId=/jars/le
Generally, the 22-argument limitation is from Scala 2.10.
In Scala 2.11, the issue with case classes is fixed, but that said, I'm not
sure whether other limitations might apply to UDFs in Java.
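As a hedged illustration of the limits being discussed (assuming Scala 2.11+):
case classes may now exceed 22 fields, but function literals are still capped
at 22 parameters because the standard library stops at scala.Function22.

// Compiles in Scala 2.11+: a case class with 23 fields.
case class Wide(
  f1: Int, f2: Int, f3: Int, f4: Int, f5: Int, f6: Int, f7: Int, f8: Int,
  f9: Int, f10: Int, f11: Int, f12: Int, f13: Int, f14: Int, f15: Int, f16: Int,
  f17: Int, f18: Int, f19: Int, f20: Int, f21: Int, f22: Int, f23: Int)

// Does not compile: there is no Function23, so a 23-argument function literal
// (for example, one passed to udf(...)) cannot be expressed directly.
// val tooWide = (a1: Int, /* ..., */ a23: Int) => a1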
_
From: Aakash Basu
Sent: Monday, December 25, 2017 9:13 PM
Subject: Re: Passing an ar
I think you are looking for spark.executor.extraJavaOptions
https://spark.apache.org/docs/latest/configuration.html#runtime-environment
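Not from the thread, but a minimal sketch of setting
spark.executor.extraJavaOptions from application code; the library path and
application name are placeholders.

import org.apache.spark.sql.SparkSession

// Executors launched for this application will be started with the option.
val spark = SparkSession.builder()
  .appName("native-lib-example")
  .config("spark.executor.extraJavaOptions", "-Djava.library.path=/usr/local/lib")
  .getOrCreate()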
From: Christopher Piggott
Sent: Tuesday, December 26, 2017 8:00:56 AM
To: user@spark.apache.org
Subject: Spark 2.2.1 worker inv
I need to set java.library.path to get access to some native code.
Following directions, I made a spark-env.sh:
#!/usr/bin/env bash
export LD_LIBRARY_PATH="/usr/local/lib/libcdfNativeLibrary.so:/usr/local/lib/libcdf.so:${LD_LIBRARY_PATH}"
export SPARK_WORKER_OPTS=-Djava.library.path=/u
Do not add a dependency on kafka-clients; the spark-streaming-kafka library
has the appropriate transitive dependencies.
Either version of the spark-streaming-kafka library should work with
1.0 brokers; what problems were you having?
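For example, a hedged sbt sketch of the advice above, with placeholder
Scala/Spark versions: depend only on the Spark Kafka integration artifact and
let kafka-clients come in transitively.

// Hypothetical build.sbt line (version is a placeholder).
// kafka-clients is pulled in transitively; do not declare it yourself.
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.2.1"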
On Mon, Dec 25, 2017 at 7:58 PM, Diogo Munaro Vieira wrote:
> He
Hi all,
I have a scenario like this:
val df = dataframe.map().filter()
// agg 1
val query1 = df.sum.writeStream.start
// agg 2
val query2 = df.count.writeStream.start
With Spark Streaming, we can call persist() on an RDD to reuse the computed
result; when we call persist() after filter() ma
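A minimal runnable sketch of the two-query scenario above (not the poster's
code), assuming a socket source and console sinks for illustration.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("two-aggregations").getOrCreate()
import spark.implicits._

val lines = spark.readStream.format("socket")
  .option("host", "localhost").option("port", 9999).load()

// Shared transformation; each query re-executes it from the source, since
// Structured Streaming does not persist intermediate results across queries.
val df = lines.as[String].map(_.trim.length).filter(_ > 0).toDF("len")

// agg 1
val query1 = df.agg(sum("len")).writeStream
  .outputMode("complete").format("console").start()

// agg 2
val query2 = df.agg(count("len")).writeStream
  .outputMode("complete").format("console").start()

spark.streams.awaitAnyTermination()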