Re: Producer.send questions

2013-08-26 Thread Jason Rosenberg
Ok, I added a comment to: https://issues.apache.org/jira/browse/KAFKA-998 I added a suggestion about exposing recoverability back to the caller. I'm not 100% sure it's the same issue (since I'm concerned about the client API), and the Jira seems concerned about returning quickly from the internal

RE: questions about ISR

2013-08-26 Thread Yu, Libo
Hi Jun, Could you confirm the following? So after a broker is out of ISR, the only way to let it go back is to restart it. We should set replica.lag.time.max.ms and replica.lag.max.messages as large as possible to keep a broker from falling out of ISR. What we have experienced is that when a bro

Re: questions about ISR

2013-08-26 Thread Jun Rao
That's right. You shouldn't need to restart the whole cluster for a broker to rejoin ISR. Do you see many ZK session expirations in the brokers (search for "(Expired)")? If so, you may need to tune the GC on the broker. Thanks, Jun On Mon, Aug 26, 2013 at 7:11 AM, Yu, Libo wrote: > Hi Jun, > >
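The log search suggested above can be scripted. The following is only a minimal sketch: the function name and the placeholder lines are hypothetical, and the one detail taken from the thread is the "(Expired)" marker itself.

```python
def count_session_expirations(log_lines, marker="(Expired)"):
    """Count the lines that contain the session-expiration marker.

    `log_lines` is any iterable of strings, for example an open log file.
    The marker default comes from the thread; everything else is illustrative.
    """
    return sum(1 for line in log_lines if marker in line)


# Placeholder lines, not real broker output.
placeholder_lines = [
    "placeholder line A",
    "placeholder line B (Expired)",
    "placeholder line C (Expired)",
]
print(count_session_expirations(placeholder_lines))
```

A count that keeps growing over time would be the "many ZK session expirations" case that the reply ties to GC tuning.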

Re: Consumer throughput imbalance

2013-08-26 Thread Jay Kreps
Yeah it is always equal to the fetch size. The fetch size needs to be at least equal to the max message size you have allowed on the server, though. -Jay On Sun, Aug 25, 2013 at 10:00 PM, Ian Friedman wrote: > Jay - is there any way to control the size of the interleaved chunks? The > performa
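The constraint stated above (fetch size at least equal to the maximum message size allowed on the server) can be written as a quick sanity check. This is a hedged sketch: the function name and the byte values are hypothetical and not taken from any Kafka API.

```python
def fetch_size_is_sufficient(fetch_size_bytes, max_message_size_bytes):
    """Return True when the consumer fetch size can hold the largest
    message the server is configured to accept."""
    return fetch_size_bytes >= max_message_size_bytes


# Hypothetical values, chosen only to show both outcomes.
print(fetch_size_is_sufficient(1024 * 1024, 512 * 1024))  # fetch size larger
print(fetch_size_is_sufficient(256 * 1024, 512 * 1024))   # fetch size too small
```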

Re: Consumer throughput imbalance

2013-08-26 Thread Ian Friedman
Just to make sure I have this right, on the producer side we'd set max.message.size and then on the consumer side we'd set fetch.size? I admittedly didn't research how all the tuning options would affect us, thank you for the info. Would queuedchunks.max have any effect? -- Ian Friedman On M

Re: Consumer throughput imbalance

2013-08-26 Thread Ian Friedman
On Sunday, August 25, 2013 at 3:11 PM, Jay Kreps wrote: > I'm still a little confused by your description of the problem. It might be > easier to understand if you listed out the exact things you have measured, > what you saw, and what you expected to see. The problem is that some consumers are sl

Re: Consumer throughput imbalance

2013-08-26 Thread Jay Kreps
Yes exactly. Lowering queuedchunks.max shouldn't help if the problem is what I described. That option controls how many chunks the consumer has ready in memory for processing. But we are hypothesizing that your problem is actually that the individual chunks are just too large leading to the con
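To see why the two settings act differently, here is some rough arithmetic. It assumes each queued chunk is as large as the fetch size, as stated earlier in this thread, and it ignores any other memory the consumer uses. The function name and the numbers are hypothetical.

```python
def max_buffered_bytes(fetch_size_bytes, queued_chunks):
    """Upper bound on bytes held in queued chunks, assuming every chunk
    is exactly one fetch size (an assumption, not a measured value)."""
    return fetch_size_bytes * queued_chunks


# Hypothetical settings: lowering the chunk count shrinks the total,
# but each individual chunk is still one full fetch size.
fetch_size = 1024 * 1024
for chunks in (10, 2):
    total = max_buffered_bytes(fetch_size, chunks)
    print(chunks, "chunks:", total, "bytes total,", fetch_size, "bytes per chunk")
```

Under that assumption, only a smaller fetch size makes the individual chunks smaller.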

Re: Consumer throughput imbalance

2013-08-26 Thread Ian Friedman
Got it, thanks Jay -- Ian Friedman On Monday, August 26, 2013 at 2:37 PM, Jay Kreps wrote: > Yes exactly. > > Lowering queuedchunks.max shouldn't help if the problem is what I > described. That option controls how many chunks the consumer has ready in > memory for processing. But we are hyp

Re: questions about ISR

2013-08-26 Thread James Wu
Hi Jun, I am curious about Yu's questions too. 1. What is the best practice for setting replica.lag.time.max.ms & replica.lag.max.messages? As large as possible, or something else? 2. If the broker exceeds one of these 2 configurations, what should we do to bring the broker back to ISR? Will controller aut

Re: questions about ISR

2013-08-26 Thread Jun Rao
I added the following in FAQ: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowtoreducechurnsinISR%3F Thanks, Jun On Mon, Aug 26, 2013 at 7:46 PM, James Wu wrote: > Hi Jun, > > I am curious Yu's questions too. > > 1. What is the best practice to set replica.lag.time.max.ms & > rep

docs

2013-08-26 Thread Jay Kreps
Hey All, We've been trying to improve the Kafka docs over the last month. Two things that would really help: 1. If you see something answered that seems like it might be a common question, add it to the FAQ. Right now only a couple people are doing this, but it would be great if everyone did. 2. I

Instances became unresponsive

2013-08-26 Thread Vadim Keylis
Somehow I am getting my instances of Kafka to crash. I started the Kafka instances one by one and they started successfully. Later, somehow two of the 3 instances became completely unresponsive. The process is running, but connecting over JMX or taking a heap dump is not possible. The last one somewhat res