We have been debating the granularity of logging. Here is an example where I just wasted a little time. I was testing a three-node cluster. I killed one of the brokers and then saw this in the other brokers' logs:
java.nio.channels.ClosedChannelException
        at kafka.network.BlockingChannel.send(BlockingChannel.scala:97)
        at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:78)
        at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:68)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:107)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:107)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:107)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:106)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:106)
        at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:106)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
        at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:105)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[2014-03-04 21:06:38,957] WARN Reconnect due to socket error: null (kafka.consumer.SimpleConsumer)
[2014-03-04 21:06:39,021] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions [my-replicated-topic,0] (kafka.server.ReplicaFetcherManager)
[2014-03-04 21:06:39,022] INFO [ReplicaFetcherThread-0-1], Shutting down (kafka.server.ReplicaFetcherThread)
[2014-03-04 21:06:41,556] INFO [ReplicaFetcherThread-0-1], Stopped (kafka.server.ReplicaFetcherThread)
[2014-03-04 21:06:41,556] INFO [ReplicaFetcherThread-0-1], Shutdown completed (kafka.server.ReplicaFetcherThread)

Oh no, my other brokers are shutting themselves down! What is going on?!
No, actually, this is just us devs unhelpfully logging the shutdown of some internal thread that I (the humble user) couldn't possibly know anything about. When normal people interact with a system like this, they really do find it unfriendly and hard to use. When implementing logging, please, please put yourself in the shoes of the person using the system. They don't actually want to see every detail of your internal code that you might be interested in; they want to see the relevant events they need to know about. What I needed to see here was just a warning about being unable to communicate with the other broker. A stack trace isn't helpful. These shutdown messages aren't helpful.

-Jay
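To make the point concrete, here is a minimal sketch (not Kafka's actual code; the class and method names are hypothetical) of the idea: catch the low-level channel error at the boundary, emit one short user-facing WARN naming the broker, and keep the full stack trace at DEBUG for developers who actually want it.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class FetcherExample {
    private static final Logger log = Logger.getLogger(FetcherExample.class.getName());

    // Hypothetical helper: turn an internal exception into one actionable
    // user-facing message -- what happened, and to which broker.
    static String describeFetchFailure(String brokerHost, int brokerPort, Exception e) {
        return String.format(
            "Unable to communicate with broker %s:%d (%s); will retry",
            brokerHost, brokerPort, e.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        Exception cause = new java.nio.channels.ClosedChannelException();
        String msg = describeFetchFailure("broker-1", 9092, cause);

        // One WARN line for the operator...
        log.log(Level.WARNING, msg);
        // ...and the full stack trace only at a debug-level setting.
        log.log(Level.FINE, "fetch request failed", cause);

        System.out.println(msg);
    }
}
```

The operator sees a single line like "Unable to communicate with broker broker-1:9092 (ClosedChannelException); will retry", while the trace and internal thread-lifecycle chatter stay out of the default log level.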