Re: Occasional batch send errors

2013-04-24 Thread Neha Narkhede
It is highly recommended that Kafka and Zookeeper be deployed on different boxes. Also make sure they get dedicated disks, separate from log4j and the OS. Thanks, Neha On Wednesday, April 24, 2013, Karl Kirch wrote: > So switched to sync producer to see what would happen. > I still get the conne

Re: Occasional batch send errors

2013-04-24 Thread Karl Kirch
So I switched to the sync producer to see what would happen. I still get the connection reset by peer error randomly (I say randomly, but it seems to be connected to some zookeeper CancelledKeyExceptions), but unfortunately it throws an error on the message after the one that didn't get sent. Is that t

Re: Occasional batch send errors

2013-04-24 Thread Karl Kirch
So I'm seeing CancelledKeyExceptions cropping up about the time that the connections get reset. Is this a zookeeper error that I'm hitting? Karl On Apr 24, 2013, at 9:55 AM, Karl Kirch wrote: > Just got logging cranked up. Will let you know when I see it again. > > Thanks, > Karl > > On Apr

Re: Occasional batch send errors

2013-04-24 Thread Karl Kirch
Thanks Andrew, I'm not seeing the event queue exception, but I'm running my cluster on a set of virtual machines which share the same physical hardware (I know, exactly what I'm not supposed to do) and I'm getting some slow fsync zookeeper warnings in my logs. I imagine that my broker writes a

Re: Occasional batch send errors

2013-04-24 Thread Karl Kirch
Just got logging cranked up. Will let you know when I see it again. Thanks, Karl On Apr 23, 2013, at 8:11 PM, Jun Rao wrote: > This means that the broker closed the socket connection for some reason. > The broker log around the same time should show the reason. Could you dig > that out? > > T

Re: Occasional batch send errors

2013-04-23 Thread Jun Rao
This means that the broker closed the socket connection for some reason. The broker log around the same time should show the reason. Could you dig that out? Thanks, Jun On Tue, Apr 23, 2013 at 3:35 PM, Karl Kirch wrote: > I occasionally am getting some batch send errors from the stock async >

Re: Occasional batch send errors

2013-04-23 Thread Karl Kirch
I really need the speed of the async producer (unless the sync producer is able to get up in the 100k/sec range...) so the sync producer is going to be a tough sell. I've also double checked my config settings and they're good. I did notice some slow fsync warnings in the Kafka broker logs tho

Re: Occasional batch send errors

2013-04-23 Thread Xavier Stevens
Usually these types of errors occur because you're not connecting to the proper host:port. Double check your configs, and make sure everything is running and listening on the host:port you think it is. Have you tried using the sync producer to work out your bugs? My guess is the sync producer wo

Re: Occasional batch send errors

2013-04-23 Thread Andrew Neilson
Hey Karl, I have a very similar setup (3 kafka 0.7.2 brokers, 3 ZK 3.4.3 nodes) that I'm running right now and am getting the same error on the producers. Haven't resolved it yet: ERROR ProducerSendThread--1585663279 kafka.producer.async.ProducerSendThread - Error in handling batch of 200 events j

Re: Occasional batch send errors

2013-04-23 Thread Karl Kirch
Hmmm… that didn't seem to help. Anyone else seeing this sort of error? Karl On Apr 23, 2013, at 5:58 PM, Karl Kirch wrote: > I'm going to try bumping up the "numRetries" key in my producer config. > Is this a good option in this case? > I am using the zookeeper connect option so I'm aware tha

Re: Occasional batch send errors

2013-04-23 Thread Karl Kirch
I'm going to try bumping up the "numRetries" key in my producer config. Is this a good option in this case? I am using the zookeeper connect option so I'm aware that I may get stuck retrying to a failed node, but if it's just a temporary network glitch I'll at least get a bit more of a chance t

Occasional batch send errors

2013-04-23 Thread Karl Kirch
I am occasionally getting some batch send errors from the stock async producer. This is on a cluster of 3 kafka (0.7.2) and 3 zookeeper nodes. Is there any way to check what happens when those batch errors occur? Or bump up the retry count? (Looks like it only did a single retry.) I need the spe