Re: Producer message ordering problem

2013-08-25 Thread Ross Black
Hi Jay, Jun, Thanks for your comments - you have confirmed what I thought was most likely the case. I will attempt to work around the issue for the moment in the client to minimise the chance of the out-of-order problem occurring (probably by stopping retries and triggering a fail-fast of the JVM

Re: Producer message ordering problem

2013-08-25 Thread Ross Black
Hi Phillip, Thanks for your input. I did evaluate Storm about 9 months ago before going down the path of developing this myself on top of Kafka. The primary reason for not using Storm was the inability to control allocation of requests to processing elements. This same requirement was the reason

Re: Consumer throughput imbalance

2013-08-25 Thread Neha Narkhede
Making producer side partitioning depend on consumer behavior might not be such a good idea. If consumption is a bottleneck, changing producer side partitioning may not help. To relieve consumption bottleneck, you may need to increase the number of partitions for those topics and increase the numbe

Re: Consumer throughput imbalance

2013-08-25 Thread Ian Friedman
What if you don't know ahead of time how long a message will take to consume? -- Ian Friedman On Sunday, August 25, 2013 at 10:45 AM, Neha Narkhede wrote: > Making producer side partitioning depend on consumer behavior might not be > such a good idea. If consumption is a bottleneck, changing

Re: Consumer throughput imbalance

2013-08-25 Thread Mark
I don't think it would matter as long as you separate the types of messages into different topics. Then just add more consumers to the ones that are slow. Am I missing something? On Aug 25, 2013, at 8:59 AM, Ian Friedman wrote: > What if you don't know ahead of time how long a message will take t

Managing consumers/groups

2013-08-25 Thread Mark
I imagine when you start to have dozens of different consumers and multiple consumers in each group this gets really complicated to manage. What are people out there using to manage, start/stop and monitor their consumer groups? Any way to visualize the grouping and status of each consumer or con

Re: Consumer throughput imbalance

2013-08-25 Thread Ian Friedman
When I said "some messages take longer than others" that may have been misleading. What I meant there is that the performance of the entire application is inconsistent, mostly due to pressure from other applications (mapreduce) on our HBase and MySQL backends. On top of that, some messages just

Re: Consumer throughput imbalance

2013-08-25 Thread Ian Friedman
Sorry, I reread what I've written so far and found that it doesn't state the actual problem very well. Let me clarify once again: The problem we're trying to solve is that we can't let messages go for unbounded amounts of time without getting processed, and it seems that something about what we

Re: Consumer throughput imbalance

2013-08-25 Thread Jay Kreps
I'm still a little confused by your description of the problem. It might be easier to understand if you listed out the exact things you have measured, what you saw, and what you expected to see. Since you mentioned the consumer I can give a little info on how that works. The consumer consumes from

Re: Ganglia Metrics Reporter

2013-08-25 Thread Andrew Headrick
FYI.. We wrote a library that is essentially the exact same thing as metrics. The only reason we didn't use metrics was because it didn't exist yet. There is a graphite reporter which could be repurposed for metrics. https://github.com/ticketfly/pillage https://github.com/Ticketfly/pillage/blob/mas

Re: Managing consumers/groups

2013-08-25 Thread Jun Rao
To monitor if the consumers are keeping up, take a look at max lag and min fetch jmx beans described in http://kafka.apache.org/documentation.html#monitoring Thanks, Jun On Sun, Aug 25, 2013 at 9:18 AM, Mark wrote: > I imagine when you start to have dozens of different consumers and > multipl

Re: Producer.send questions

2013-08-25 Thread Guozhang Wang
Actually we do have a JIRA tracking this issue: https://issues.apache.org/jira/browse/KAFKA-998 And BTW, any review comments are welcome :) Guozhang On Sat, Aug 24, 2013 at 8:25 PM, Neha Narkhede wrote: > >> Ok, but perhaps the producer will handle something like this in the > future? > > Yes

Re: Consumer throughput imbalance

2013-08-25 Thread Ian Friedman
Jay - is there any way to control the size of the interleaved chunks? The performance hit would likely be negligible for us at the moment. -- Ian Friedman On Sunday, August 25, 2013 at 3:11 PM, Jay Kreps wrote: > I'm still a little confused by your description of the problem. It might be >