Re: Reuse ConsumerConnector

2014-12-10 Thread YuanJia Li
Hi Chen, maybe you can take some tips from kafka.tools.KafkaMigrationTool.java; it is also a multithreaded use case. Reusing the same ConsumerConnector every time is OK. If you repeatedly create a ConsumerConnector with the same consumer.id, a conflict will occur in ZooKeeper. Yuanjia Li From: Chen
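
A minimal sketch of the create-once, reuse-everywhere pattern suggested here, written against the 0.8-era high-level consumer API; the ZooKeeper address, group id, topic name, and timeout are assumptions for illustration, and note that createMessageStreams can only be called once per connector:

    // Sketch: build the ConsumerConnector and its streams once, then let the
    // timer callback reuse them instead of recreating the connector per tick.
    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.ConsumerTimeoutException;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class TimerConsumer {
        private final ConsumerConnector connector;
        private final KafkaStream<byte[], byte[]> stream;

        public TimerConsumer() {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // assumed address
            props.put("group.id", "timer-consumer");          // assumed group id
            props.put("consumer.timeout.ms", "1000");         // let each tick return
            connector = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
            // createMessageStreams may only be called once per connector.
            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("my-topic", 1));
            stream = streams.get("my-topic").get(0);
        }

        // Timer callback: drain what is available, then wait for the next tick.
        public void onTimer() {
            ConsumerIterator<byte[], byte[]> it = stream.iterator();
            try {
                while (it.hasNext()) {
                    byte[] message = it.next().message();
                    // process message ...
                }
            } catch (ConsumerTimeoutException e) {
                // no message arrived within consumer.timeout.ms
            }
        }
    }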

Re: how to achieve availability with no data loss when replica = 1?

2014-12-10 Thread Helin Xiang
Yes, I mean replication-factor == 1, because the data volume is huge for some special topics and we care more about disk and network resources. So in your opinion, we can't have durability and availability as long as the replication-factor == 1. Unless we implement another type of produc

Re: how to achieve availability with no data loss when replica = 1?

2014-12-10 Thread Joe Stein
By replica == 1 do you mean replication-factor == 1 or something different? You should have replication-factor == 3 if you are trying to have durable writes survive failure. On the producer side, set ack = -1 for it to work as expected. On Wed, Dec 10, 2014 at 7:14 PM, Helin Xiang wrote

Re: how to achieve availability with no data loss when replica = 1?

2014-12-10 Thread Helin Xiang
Thanks for the reply, Joe. In my opinion, when replica == 1, ack == -1 would cause the producer to stop sending any data to the Kafka cluster if 1 broker is down. That means we could not tolerate a single point of failure. Am I right? What we want is: when 1 broker is down, and the topic replica is set to 1,

Re: OutOfMemoryException when starting replacement node.

2014-12-10 Thread Solon Gordon
I see, thank you for the explanation. You might consider being more explicit about this in your documentation. We didn't realize we needed to take the (partitions * fetch size) calculation into account when choosing partition counts for our topics, so this is a bit of a rude surprise. On Wed, Dec

JMX mbean of Gauge[Long] is a java.lang.Object

2014-12-10 Thread Jack Foy
Kafka 0.8.2-beta consumer, 0.8.1 broker. In our consumer, we retrieve the MBeans "kafka.server":name="*-ConsumerLag",type="FetcherLagMetrics" in order to send partition lag data to our monitoring service. The underlying metrics object is a Gauge[Long], but the "Value" attribute returned by JMX
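
A hedged workaround sketch from the monitoring side: since the JMX signature types the attribute as java.lang.Object, read it as Object and unbox through Number instead of casting straight to Long. The JMX service URL and the object-name pattern are assumptions based on the message above.

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ConsumerLagReader {
        public static void main(String[] args) throws Exception {
            // Assumed JMX endpoint; adjust host/port for your broker.
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                // Pattern mirrors the MBean name quoted in the message above.
                ObjectName pattern = new ObjectName(
                    "\"kafka.server\":type=\"FetcherLagMetrics\",name=\"*-ConsumerLag\"");
                Set<ObjectName> names = conn.queryNames(pattern, null);
                for (ObjectName name : names) {
                    Object value = conn.getAttribute(name, "Value");
                    long lag = ((Number) value).longValue(); // unbox defensively
                    System.out.println(name + " lag=" + lag);
                }
            }
        }
    }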

New Go Kafka Client

2014-12-10 Thread Joe Stein
Hi, we open sourced a new Go Kafka Client (http://github.com/stealthly/go_kafka_client); some more info is on the blog post http://allthingshadoop.com/2014/12/10/making-big-data-go/ for those working with Go or looking to get into Go and Apache Kafka. /*** Joe Stein F

Re: How to raise a question in forum

2014-12-10 Thread nitin sharma
Hello Otis, actually my operations team told me that they are seeing the following JMX MBean on one of the brokers: kafka.server > ReplicaFetcherThread-0-2-host_<>-port_<>-<>-<>-ConsumerLag. This is what is confusing me, and when I monitored this MBean, I saw values coming in negative. Regar

Re: How to raise a question in forum

2014-12-10 Thread Otis Gospodnetic
Hi, On Wed, Dec 10, 2014 at 12:02 AM, nitin sharma wrote:
> thanks for response. But my questions are still not answered
> a. Is ConsumerLag the metric that is exposed on the kafka broker end or on the Consumer side?
Consumer.
> b. Can the ConsumerLag value be negative?
Shouldn't be. Not

Re: OutOfMemoryException when starting replacement node.

2014-12-10 Thread Gwen Shapira
Ah, found where we actually size the request as partitions * fetch size. Thanks for the correction, Jay, and sorry for the mix-up, Solon. On Wed, Dec 10, 2014 at 10:41 AM, Jay Kreps wrote: > Hey Solon, > > The 10MB size is per-partition. The rationale for this is that the fetch > size per-partiti

Reuse ConsumerConnector

2014-12-10 Thread Chen Wang
Hey Guys, I have a use case where my thread reads from different Kafka topics periodically through a timer. The way I am reading from Kafka in the timer callback is the following: try { Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumerConnector.createMessageStreams(topicCountMap); List<KafkaStream<byte[], byte[]>> streamList = consumerMap

Re: OutOfMemoryException when starting replacement node.

2014-12-10 Thread Jay Kreps
Hey Solon, The 10MB size is per-partition. The rationale for this is that the fetch size per-partition is effectively a max message size. However, with so many partitions on one machine this will lead to a very large fetch size. We don't do a great job of scheduling these to stay under a memory bou
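
To make the arithmetic concrete (fetch size from this thread, partition count assumed for illustration): with replica.fetch.max.bytes at 10MB, a broker that fetches around 200 partitions from one peer in a single request may try to allocate on the order of 200 * 10MB = 2GB, which lines up with the ~2GB allocation reported earlier in the thread.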

Re: OutOfMemoryException when starting replacement node.

2014-12-10 Thread Gwen Shapira
If you have replica.fetch.max.bytes set to 10MB, I would not expect 2GB allocation in BoundedByteBufferReceive when doing a fetch. Sorry, out of ideas on why this happens... On Wed, Dec 10, 2014 at 8:41 AM, Solon Gordon wrote: > Thanks for your help. We do have replica.fetch.max.bytes set to 10M

Re: OutOfMemoryException when starting replacement node.

2014-12-10 Thread Solon Gordon
Thanks for your help. We do have replica.fetch.max.bytes set to 10MB to allow larger messages, so perhaps that's related. But should that really be big enough to cause OOMs on an 8GB heap? Are there other broker settings we can tune to avoid this issue? On Wed, Dec 10, 2014 at 11:05 AM, Gwen Shapi

Re: how to achieve availability with no data loss when replica = 1?

2014-12-10 Thread Joe Stein
If you want no data loss then you need to set ack = -1. Copied from https://kafka.apache.org/documentation.html#producerconfigs: -1, which means that the producer gets an acknowledgement after all in-sync replicas have received the data. This option provides the best durability; we guarantee that
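
For illustration, a minimal sketch of this setting in the 0.8-era Java producer, where ack = -1 is spelled request.required.acks; the broker list and topic are assumptions:

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class DurableProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // assumed
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            // ack = -1: wait until all in-sync replicas have the write.
            props.put("request.required.acks", "-1");

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
            producer.send(new KeyedMessage<String, String>("my-topic", "key", "value"));
            producer.close();
        }
    }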

how to achieve availability with no data loss when replica = 1?

2014-12-10 Thread Helin Xiang
Hi, for some topics in our system, the data volume is so huge that we think keeping an extra replica is a waste of disk and network resources (plus the data is not so important). Firstly, we used 1 replica + ack=0 and found that when 1 broker is down, we would lose 1/n of the data. Then we tried 1 replica + ack=1, and

Re: OutOfMemoryException when starting replacement node.

2014-12-10 Thread Gwen Shapira
There is a parameter called replica.fetch.max.bytes that controls the size of the message buffer a broker will attempt to consume at once. It defaults to 1MB, and has to be at least message.max.bytes (so at least one message can be sent). If you try to support really large messages and increase t
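
As a hedged illustration, the two settings described here might look like this in server.properties (the 10MB value is taken from this thread):

    # Largest message the broker will accept.
    message.max.bytes=10485760
    # Must be >= message.max.bytes so replication can fetch every message.
    replica.fetch.max.bytes=10485760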

Re: leaderless topicparts after single node failure: how to repair?

2014-12-10 Thread Gwen Shapira
It looks like none of your replicas are in sync. Did you enable unclean leader election? This will allow one of the out-of-sync replicas to become leader, leading to data loss but maintaining availability of the topic. Gwen On Tue, Dec 9, 2014 at 8:43 AM, Neil Harkins wrote: > Hi. We've suffered
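
For reference, a sketch of the broker-side switch in server.properties; the property name is from 0.8.2-era broker configs and is an assumption for older versions:

    # Allow an out-of-sync replica to become leader (availability over durability).
    unclean.leader.election.enable=true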

Re: OutOfMemoryException when starting replacement node.

2014-12-10 Thread Solon Gordon
I just wanted to bump this issue to see if anyone has thoughts. Based on the error message it seems like the broker is attempting to consume nearly 2GB of data in a single fetch. Is this expected behavior? Please let us know if more details would be helpful or if it would be better for us to file

Re: How to Ingest data into kafka

2014-12-10 Thread nitin sharma
Hi Kishore, you can use the Kafka Producer API for that. You can find sample code on the Kafka Quick Start page: http://kafka.apache.org/07/quickstart.html Regards, Nitin Kumar Sharma. On Wed, Dec 10, 2014 at 2:14 AM, kishore kumar wrote: > Hi kafkars, > > I want to write a java code to ingest
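
In the same spirit as the quickstart, a minimal sketch of ingesting records with the 0.8-era Java producer; the broker address and topic name are assumptions:

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class IngestExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "localhost:9092"); // assumed broker
            props.put("serializer.class", "kafka.serializer.StringEncoder");

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
            for (int i = 0; i < 10; i++) {
                // Each record goes to the assumed topic "ingest-test".
                producer.send(new KeyedMessage<String, String>("ingest-test", "record-" + i));
            }
            producer.close();
        }
    }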

Re: Need to use Kafka with spark

2014-12-10 Thread David Morales de Frías
This is a very nice intro to Spark and Kafka, with some valuable details about partitioning and parallelism: http://www.michael-noll.com/blog/2014/10/01/kafka-spark-streaming-integration-example-tutorial/ 2014-12-05 21:31 GMT+01:00 sanjeeb kumar : > Hi Team, > I am able to install Kafka in Ub