How to achieve distributed processing and high availability simultaneously in Kafka?

2015-05-05 Thread sumit jain
I have a topic consisting of n partitions. To have distributed processing I create two processes running on different machines. They subscribe to the topic with the same group id and allocate n/2 threads, each of which processes a single stream (n/2 partitions per process). With this I will have achieve
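The layout described above can be sketched as a simple partition-to-process mapping (a hypothetical illustration of the range-style split the poster describes, not Kafka's actual assignment code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: n partitions split evenly across processes, one stream/thread
// per partition within each process (n/2 partitions per process for 2 processes).
public class PartitionSplit {
    // Returns result.get(p) = list of partitions handled by process p.
    public static List<List<Integer>> split(int nPartitions, int nProcesses) {
        List<List<Integer>> result = new ArrayList<>();
        int perProcess = nPartitions / nProcesses;  // assumes even divisibility
        for (int p = 0; p < nProcesses; p++) {
            List<Integer> mine = new ArrayList<>();
            for (int i = p * perProcess; i < (p + 1) * perProcess; i++) {
                mine.add(i);  // one consumer thread per partition
            }
            result.add(mine);
        }
        return result;
    }

    public static void main(String[] args) {
        // 4 partitions, 2 processes -> each process runs 2 threads
        System.out.println(split(4, 2));  // [[0, 1], [2, 3]]
    }
}
```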

Re: circuit breaker for producer

2015-05-05 Thread Guozhang Wang
1. KAFKA-1955, I think Jay has a WIP patch for it. 2. 3. On Tue, May 5, 2015 at 5:10 PM, Jason Rosenberg wrote: > Guozhang, > > Do you have the ticket number for possibly adding in local log file > failover? Is it actively being wor

Re: New producer: metadata update problem on 2 Node cluster.

2015-05-05 Thread Ewen Cheslack-Postava
I'm not sure about the old producer's behavior in this same failure scenario, but creating a new producer instance would resolve the issue since it would start with the list of bootstrap nodes and, assuming at least one of them was up, it would be able to fetch up-to-date metadata. On Tue, May 5, 20

Re: New producer: metadata update problem on 2 Node cluster.

2015-05-05 Thread Jason Rosenberg
Can you clarify, is this issue here specific to the "new" producer? With the "old" producer, we routinely construct a new producer which makes a fresh metadata request (via a VIP connected to all nodes in the cluster). Would this approach work with the new producer? Jason On Tue, May 5, 2015 at

Re: circuit breaker for producer

2015-05-05 Thread Jason Rosenberg
Guozhang, Do you have the ticket number for possibly adding in local log file failover? Is it actively being worked on? Thanks, Jason On Tue, May 5, 2015 at 6:11 PM, Guozhang Wang wrote: > Does this "log file" acts as a temporary disk buffer when broker slows > down, whose data will be re-sen

Re: Round Robin Partition Assignment

2015-05-05 Thread Jason Rosenberg
I asked about this same issue in a previous thread. Thanks for reminding me; I've added this Jira: https://issues.apache.org/jira/browse/KAFKA-2172 I think this is a great new feature, but unfortunately the "all consumers must be the same" restriction is just a bit too restrictive. Jason On Tue, May

Re: 'roundrobin' partition assignment strategy restrictions

2015-05-05 Thread Jason Rosenberg
I filed this jira, fwiw: https://issues.apache.org/jira/browse/KAFKA-2172 Jason On Mon, Mar 23, 2015 at 2:44 PM, Jiangjie Qin wrote: > Hi Jason, > > Yes, I agree the restriction makes the usage of round-robin less flexible. > I think the focus of round-robin strategy is workload balance. If >

Re: circuit breaker for producer

2015-05-05 Thread Guozhang Wang
Does this "log file" act as a temporary disk buffer when the broker slows down, whose data will be re-sent to the broker later, or do you plan to use it as separate persistent storage, like Kafka brokers? For the former use case, I think there is an open ticket for integrating this kind of functionality i

Round Robin Partition Assignment

2015-05-05 Thread Bryan Baugher
Hi everyone, We recently switched to round robin partition assignment after we noticed that range partition assignment (the default) will only make use of the first X consumers, where X is the number of partitions for a topic our consumers are interested in. We then noticed the caveat in round robin, "
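For reference, the old high-level consumer selects the strategy via a config key. A minimal sketch of the relevant consumer properties (key names as of the 0.8.2-era consumer; connection values below are placeholders):

```properties
# old high-level consumer (0.8.2-era); host/group values are placeholders
zookeeper.connect=zk1:2181
group.id=my-group
# "range" is the default; "roundrobin" requires identical topic
# subscriptions and thread counts across all consumers in the group
partition.assignment.strategy=roundrobin
```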

Re: circuit breaker for producer

2015-05-05 Thread mete
Sure, I kind of count on that actually; I guess with this setting the sender blocks in the allocate method and this bufferpool-wait-ratio increases. I want to fully compartmentalize the Kafka producer from the rest of the system. Ex: writing to a log file instead of trying to send to Kafka when some m
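The compartmentalization idea, try Kafka first and spool to a local file when the producer can't accept the record, might look roughly like this (a hypothetical sketch; `SendAttempt` stands in for the real `producer.send()` call and is not a Kafka API):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch of the "write to a log file instead" fallback.
// SendAttempt stands in for producer.send(); any failure (e.g. buffer
// exhaustion with block.on.buffer.full=false) spools the record locally.
public class SpoolingSender {
    @FunctionalInterface
    public interface SendAttempt {
        void send(String record) throws Exception;
    }

    private final SendAttempt kafka;
    private final Writer spool;

    public SpoolingSender(SendAttempt kafka, Writer spool) {
        this.kafka = kafka;
        this.spool = spool;
    }

    // Returns true if the record reached Kafka, false if it was spooled.
    public boolean sendOrSpool(String record) {
        try {
            kafka.send(record);
            return true;
        } catch (Exception e) {
            try {
                spool.write(record + "\n");  // keep the record for later re-send
                spool.flush();
            } catch (IOException io) {
                throw new UncheckedIOException(io);
            }
            return false;
        }
    }
}
```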

Re: New producer: metadata update problem on 2 Node cluster.

2015-05-05 Thread Rahul Jain
Mayuresh, I was testing this in a development environment and manually brought down a node to simulate this. So the dead node never came back up. My colleague and I were able to consistently see this behaviour several times during the testing. On 5 May 2015 20:32, "Mayuresh Gharat" wrote: > I ag

Re: circuit breaker for producer

2015-05-05 Thread Jay Kreps
Does block.on.buffer.full=false do what you want? -Jay On Tue, May 5, 2015 at 1:59 AM, mete wrote: > Hello Folks, > > I was looking through the kafka.producer metrics on the JMX interface, to > find a good indicator when to "trip" the circuit. So far it seems like the > "bufferpool-wait-ratio"
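The setting Jay refers to is a new-producer config (0.8.2-era name). A minimal sketch of where it goes, with placeholder broker addresses:

```properties
# new producer (0.8.2-era); broker addresses are placeholders
bootstrap.servers=broker1:9092,broker2:9092
# throw BufferExhaustedException immediately instead of blocking
# when the producer's memory buffer is full
block.on.buffer.full=false
```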

Re: New producer: metadata update problem on 2 Node cluster.

2015-05-05 Thread Mayuresh Gharat
I agree that to find the least loaded node the producer should fall back to the bootstrap nodes if it's not able to connect to any nodes in the current metadata. That should resolve this. Rahul, I suppose the problem went away because the dead node in your case might have come back up and allowed fo

Re: New producer: metadata update problem on 2 Node cluster.

2015-05-05 Thread Rahul Jain
We observed the exact same error. Not very clear about the root cause, although it appears to be related to the leastLoadedNode implementation. Interestingly, the problem went away by increasing the value of reconnect.backoff.ms to 1000ms. On 29 Apr 2015 00:32, "Ewen Cheslack-Postava" wrote: > Ok, all
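The workaround mentioned is a single producer config change; a minimal sketch (broker addresses are placeholders):

```properties
bootstrap.servers=broker1:9092,broker2:9092
# wait 1s before retrying a connection to a failed node, giving
# leastLoadedNode a chance to pick a different (live) broker
reconnect.backoff.ms=1000
```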

Re: Kafka Cluster Issue

2015-05-05 Thread Kamal C
This is resolved; I had missed the host entry configuration in my infrastructure. On Mon, May 4, 2015 at 10:35 AM, Kamal C wrote: > We are running ZooKeeper in an ensemble (cluster of 3 / 5). With further > investigation, I found that the ConnectException is thrown for all "inflight" > producers. > > Say

circuit breaker for producer

2015-05-05 Thread mete
Hello Folks, I was looking through the kafka.producer metrics on the JMX interface to find a good indicator of when to "trip" the circuit. So far it seems like the "bufferpool-wait-ratio" metric is a useful decision mechanism for when to cut off production to Kafka. As far as I experienced, when ka
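A circuit-breaker check on that metric might be sketched as below. This is a hypothetical illustration: the MBean name follows the new producer's `kafka.producer:type=producer-metrics,client-id=...` pattern, `my-client` is a placeholder, and the trip threshold is arbitrary.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: poll the producer's bufferpool-wait-ratio via JMX
// and "trip" the circuit when it exceeds a threshold.
public class ProducerCircuit {
    // Pure decision logic: trip when the observed wait ratio exceeds the threshold.
    public static boolean shouldTrip(double waitRatio, double threshold) {
        return waitRatio > threshold;
    }

    // Reads the metric from the platform MBean server; returns -1.0 if the
    // producer MBean is not registered (e.g. no producer running in this JVM).
    public static double readWaitRatio(String clientId) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(
            "kafka.producer:type=producer-metrics,client-id=" + clientId);
        if (!server.isRegistered(name)) {
            return -1.0;
        }
        return (Double) server.getAttribute(name, "bufferpool-wait-ratio");
    }

    public static void main(String[] args) throws Exception {
        double ratio = readWaitRatio("my-client");  // "my-client" is a placeholder
        if (ratio >= 0 && shouldTrip(ratio, 0.5)) {
            System.out.println("circuit open: stop producing to Kafka");
        }
    }
}
```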