Re: KafkaUtils not consuming all the data from all partitions

2015-01-07 Thread Mukesh Jha
, wrote:
>> Hi Mukesh,
>>
>> If my understanding is correct, each Stream only has a single Receiver.
>> So, if you have each receiver consuming 9 partitions, you need 10 input
>> DStreams to create 10 concurrent receivers:
>>
>> https://spark.ap
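The one-receiver-per-DStream point above could be sketched roughly as follows against Spark Streaming's receiver-based API (spark-streaming-kafka, Spark 1.x era). This is an illustrative assumption, not code from the thread: the topic, ZooKeeper quorum, group id, and batch interval are all placeholders.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class MultiReceiverSketch {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setAppName("multi-receiver-sketch");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

    int numReceivers = 10; // one receiver per input DStream, as noted above
    Map<String, Integer> topicMap =
        Collections.singletonMap("my-topic", 1); // consumer threads per receiver

    List<JavaPairDStream<String, String>> streams = new ArrayList<>();
    for (int i = 0; i < numReceivers; i++) {
      streams.add(KafkaUtils.createStream(jssc, "zk-host:2181", "my-group", topicMap));
    }

    // Union the per-receiver streams into a single DStream for processing.
    JavaPairDStream<String, String> unified =
        jssc.union(streams.get(0), streams.subList(1, streams.size()));

    unified.count().print();
    jssc.start();
    jssc.awaitTermination();
  }
}
```

With this layout, each of the 10 receivers occupies one executor core, so the cluster needs more cores than receivers for processing to keep up.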

KafkaUtils not consuming all the data from all partitions

2015-01-07 Thread Mukesh Jha
000");
kafkaConf.put("zookeeper.session.timeout.ms", "6000");
kafkaConf.put("zookeeper.connection.timeout.ms", "6000");
kafkaConf.put("zookeeper.sync.time.ms", "2000");
kafkaConf.put("rebalance.backoff.ms", "1");
kafkaConf.put("rebalance.max.retries", "20");
-- Thanks & Regards, *Mukesh Jha *
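A config map like the one quoted above is typically handed to the byte-array overload of KafkaUtils.createStream. A minimal sketch, assuming Spark 1.x with Kafka 0.8 (the ZooKeeper address, group id, and topic name are placeholders, not values from the thread):

```java
import java.util.HashMap;
import java.util.Map;

import kafka.serializer.DefaultDecoder;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class KafkaConfSketch {
  public static JavaPairDStream<byte[], byte[]> stream(JavaStreamingContext jssc) {
    Map<String, String> kafkaConf = new HashMap<>();
    kafkaConf.put("zookeeper.connect", "zk-host:2181"); // placeholder
    kafkaConf.put("group.id", "my-group");              // placeholder
    kafkaConf.put("zookeeper.session.timeout.ms", "6000");
    kafkaConf.put("zookeeper.connection.timeout.ms", "6000");
    kafkaConf.put("zookeeper.sync.time.ms", "2000");
    kafkaConf.put("rebalance.max.retries", "20");

    Map<String, Integer> topicMap = new HashMap<>();
    topicMap.put("my-topic", 1); // threads within this one receiver

    // Consume raw byte payloads; decode the avro values downstream.
    return KafkaUtils.createStream(jssc,
        byte[].class, byte[].class,
        DefaultDecoder.class, DefaultDecoder.class,
        kafkaConf, topicMap,
        StorageLevel.MEMORY_AND_DISK_SER());
  }
}
```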

Re: Kafka getMetadata api

2015-01-02 Thread Mukesh Jha
Indeed my message size varies between ~500kb and ~5mb per avro. I am using kafka as I need a scalable pub-sub messaging architecture with multiple producers and consumers and a guarantee of delivery. Keeping data on the filesystem or HDFS won't give me that. Also, in the link below [1] there is a linkedin's

Re: Kafka getMetadata api

2015-01-02 Thread Mukesh Jha
> > > One option is to partition the data using key and consume from relevant
> > partition.
> > Or your current approach (filtering messages in the application) should be
> > OK.
> >
> > Using separate getMetaData/getkey and getMessage may hit the consume
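The "filter in the application" option mentioned above could look roughly like this with Kafka 0.8's high-level consumer. This is a sketch under assumptions: the topic, group id, and ZooKeeper address are placeholders, and isRelevant/process are hypothetical helpers. Note that the broker still ships the full message either way, so skipping by key saves avro-decoding work but not network transfer:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class KeyFilteringConsumerSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("zookeeper.connect", "zk-host:2181"); // placeholder
    props.put("group.id", "my-group");              // placeholder
    ConsumerConnector consumer =
        Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

    Map<String, List<KafkaStream<byte[], byte[]>>> streams =
        consumer.createMessageStreams(Collections.singletonMap("my-topic", 1));

    for (MessageAndMetadata<byte[], byte[]> mm : streams.get("my-topic").get(0)) {
      byte[] key = mm.key(); // the metadata travels in the key
      if (!isRelevant(key)) {
        continue;            // skip without decoding the large avro value
      }
      process(mm.message());
    }
  }

  // Hypothetical helpers for illustration only.
  static boolean isRelevant(byte[] key) { return true; }
  static void process(byte[] value) { }
}
```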

Re: Kafka getMetadata api

2015-01-02 Thread Mukesh Jha
Any pointers guys? On 1 Jan 2015 15:26, "Mukesh Jha" wrote: > Hello Experts, > > I'm using a kafka topic to store a bunch of messages where the key contains > metadata and the value is the data (an avro file in our case). > There are multiple consumers for each topic and th
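The other option raised in the replies, partitioning by key so each consumer subscribes only to its relevant partition, could be sketched with Kafka 0.8's producer-side Partitioner interface. The class name and hashing scheme here are illustrative assumptions, not from the thread:

```java
import kafka.producer.Partitioner;
import kafka.utils.VerifiableProperties;

// Registered on the producer via: partitioner.class=MetadataPartitioner
// Kafka 0.8 instantiates this class reflectively, passing the producer's
// VerifiableProperties to the constructor.
public class MetadataPartitioner implements Partitioner {
  public MetadataPartitioner(VerifiableProperties props) { }

  @Override
  public int partition(Object key, int numPartitions) {
    // Route records deterministically by (metadata) key, so a consumer
    // interested in one key range can read from just that partition.
    return Math.abs(key.hashCode() % numPartitions);
  }
}
```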

Kafka getMetadata api

2015-01-01 Thread Mukesh Jha
w what you all think. Thanks for your help & suggestions. -- Thanks & Regards, *Mukesh Jha *