AFAIK, there are two levels of parallelism related to the Spark Kafka
consumer:
- At JVM level: for each receiver, one can specify the number of threads for
a given topic, provided as a map [topic -> nthreads]. This will
effectively start n JVM threads consuming partitions of that Kafka topic
(see the sketch just below).
- At cluster level: one can create several input DStreams, each backed by its
own receiver, and union them, so that consumption is spread across executors.
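As an illustration of the first level, here is a minimal sketch against the
Spark 1.2 Java API; the topic name, ZooKeeper address and group id are made
up, and jssc is assumed to be an existing JavaStreamingContext:

import java.util.HashMap;
import java.util.Map;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.kafka.KafkaUtils;

// [topic -> nthreads]: ask this one receiver to run 10 consumer threads
// for "mytopic"
Map<String, Integer> topicThreads = new HashMap<String, Integer>();
topicThreads.put("mytopic", 10);
JavaPairReceiverInputDStream<String, String> stream =
    KafkaUtils.createStream(jssc, "zkhost:2181", "consumer-group", topicThreads);

Note that these are threads inside a single receiver's JVM; they do not add
receivers or cores.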
- You are launching up to 10 threads/topic per Receiver. Are you sure your
receivers can support 10 threads each? (i.e., in the default configuration, do
they have 10 cores?) If they have 2 cores, that would explain why this works
with 20 partitions or fewer.
- If you have 90 partitions, why
I understand that I have to create 10 parallel streams. My code runs
fine when the number of partitions is ~20, but when I increase the number of
partitions I keep running into this issue.
Below is my code to create the Kafka streams, along with the configs used.
Map<String, String> kafkaConf = new HashMap<String, String>();
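The snippet above is cut off, so here is a minimal sketch of how such code
typically continues with the Spark 1.2 Java API (imports as in the sketch
further up, plus kafka.serializer.DefaultDecoder and
org.apache.spark.storage.StorageLevel); the config values, topic name and
byte[]/DefaultDecoder choices are placeholders, not the actual settings from
this thread:

kafkaConf.put("zookeeper.connect", "zkhost:2181");        // placeholder
kafkaConf.put("group.id", "spark-streaming-group");       // placeholder
kafkaConf.put("zookeeper.connection.timeout.ms", "1000"); // placeholder

// one consumer thread per receiver for this topic
Map<String, Integer> topicMap = new HashMap<String, Integer>();
topicMap.put("mytopic", 1);

// kafka.serializer.DefaultDecoder hands back the raw byte[] payloads
JavaPairReceiverInputDStream<byte[], byte[]> stream =
    KafkaUtils.createStream(jssc, byte[].class, byte[].class,
        DefaultDecoder.class, DefaultDecoder.class,
        kafkaConf, topicMap, StorageLevel.MEMORY_AND_DISK_SER());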
Hi,
Could you add the code where you create the Kafka consumer?
-kr, Gerard.
Hi Mukesh,
If my understanding is correct, each Stream only has a single Receiver. So, if
you have each receiver consuming 9 partitions, you need 10 input DStreams to
create 10 concurrent receivers:
https://spark.apache.org/docs/latest/streaming-programming-guide.html#level-of-parallelism
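For reference, the pattern from that section of the guide looks roughly like
this in Java; the stream count mirrors this thread, and zkQuorum, group and
topicMap stand in for the usual connection settings:

import java.util.ArrayList;
import java.util.List;
import org.apache.spark.streaming.api.java.JavaPairDStream;

int numStreams = 10; // one receiver per input DStream
List<JavaPairDStream<String, String>> kafkaStreams =
    new ArrayList<JavaPairDStream<String, String>>(numStreams);
for (int i = 0; i < numStreams; i++) {
  kafkaStreams.add(KafkaUtils.createStream(jssc, zkQuorum, group, topicMap));
}
// Union the streams so downstream transformations see a single DStream.
JavaPairDStream<String, String> unified =
    jssc.union(kafkaStreams.get(0), kafkaStreams.subList(1, kafkaStreams.size()));

Keep in mind that each receiver occupies one core on an executor, so the
application needs more cores than receivers to leave room for processing.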
Hi Guys,
I have a Kafka topic with 90 partitions and I am running
Spark Streaming (1.2.0) to read from Kafka via KafkaUtils, creating 10
kafka-receivers.
My streaming is running fine and there is no delay in processing, just that
data from some partitions is never getting picked up. From the Kafka console I
can see that those partitions still have unconsumed data.