What replication factor are you using? Currently if a partition is offline, the message in producer will not be sent but sit in accumulator until the partition comes back online. Do you mean you want to use the message send callback to detect broker failure?
Jiangjie (Becket) Qin On 6/8/15, 12:27 AM, "ankit tyagi" <ankittyagi.mn...@gmail.com> wrote: >Hi, > >we are using .8.2.0 broker and default async producer to send the >message*. >we recently found out that if whole cluster gets down then callback >handler >is not getting called while we are getting below exception continuously* > > > >*12:36:41,267# WARN [Selector] - Error in I/O with localhost/127.0.0.1 ><http://127.0.0.1>java.net.ConnectException: Connection refusedat >sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)* >at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) >at org.apache.kafka.common.network.Selector.poll(Selector.java:238) >at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) >at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) >at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) >at java.lang.Thread.run(Thread.java:745) > >I want to know is there any way I can catch to figure out cluster is >down problematically and log all the failed message so that it can be >retry later after cluster state changes.