I now have the following settings (in various configs):

In our producer configs:

producer.request.timeout.ms=600000

This producer just hangs there for 10 minutes before timing out. Here is the stack dump for that timeout:

java.net.SocketTimeoutException
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
    at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:221)
    at kafka.utils.Utils$.read(Utils.scala:372)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
    at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
    at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
    at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73)
    at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
    at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:98)
    at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:98)
    at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:98)
    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
    at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:97)
    at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:97)
    at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:97)
    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:96)
    at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$send(DefaultEventHandler.scala:221)
    at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$1.apply(DefaultEventHandler.scala:91)
    at kafka.producer.async.DefaultEventHandler$$anonfun$dispatchSerializedData$1.apply(DefaultEventHandler.scala:85)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
    at scala.collection.Iterator$class.foreach(Iterator.scala:631)
    at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:161)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:194)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:80)
    at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:85)
    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:61)
    at kafka.producer.Producer.send(Producer.scala:76)
    at kafka.javaapi.producer.Producer.send(Producer.scala:41)
    at com.visibletechnologies.platform.common.kafka.KafkaWriter.flush(KafkaWriter.java:131)
    at com.visibletechnologies.platform.ingestion.ContentWriter.flushToKafka(ContentWriter.java:394)
    at com.visibletechnologies.platform.ingestion.Midas.processPosts(Midas.java:430)
    at com.visibletechnologies.platform.ingestion.Midas.doWork(Midas.java:194)
    at com.visibletechnologies.framework.servicebase.ServiceBase.start(ServiceBase.java:187)
    at com.visibletechnologies.platform.ingestion.Main.main(Main.java:413)
2013-03-29 22:31:58,633 INFO kafka.client.ClientUtils$: Fetching metadata for topic Set(VTFull-enriched)

Kafka server.properties (both brokers):

replica.socket.timeout.ms=200000
controller.socket.timeout.ms=200000
zk.connectiontimeout.ms=1000000

After seeing these issues last week, the problems magically vanished, but yesterday they came back, and we are now seeing consistent SocketTimeoutExceptions both in the producers and in the brokers.

We never see these issues where we have only one broker running.

Thanks,
Bob Jervis
Visible Technologies