[ https://issues.apache.org/jira/browse/KAFKA-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279396#comment-14279396 ]
Alexey Ozeritskiy commented on KAFKA-1804:
------------------------------------------
We've written a simple patch for kafka-network-thread:
{code:java}
override def run(): Unit = {
  try {
    original_run()
  } catch {
    case e: Throwable =>
      error("ERROR IN NETWORK THREAD: %s".format(e), e)
      Runtime.getRuntime.halt(1)
  }
}
{code}
and got the following trace:
{code}
[2015-01-15 23:04:08,537] ERROR ERROR IN NETWORK THREAD: java.util.NoSuchElementException: None.get (kafka.network.Processor)
java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:313)
    at scala.None$.get(Option.scala:311)
    at kafka.network.ConnectionQuotas.dec(SocketServer.scala:544)
    at kafka.network.AbstractServerThread.close(SocketServer.scala:165)
    at kafka.network.AbstractServerThread.close(SocketServer.scala:157)
    at kafka.network.Processor.close(SocketServer.scala:394)
    at kafka.network.Processor.processNewResponses(SocketServer.scala:426)
    at kafka.network.Processor.iteration(SocketServer.scala:328)
    at kafka.network.Processor.run(SocketServer.scala:381)
    at java.lang.Thread.run(Thread.java:745)
{code}
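The trace suggests that ConnectionQuotas.dec calls get on an empty Option, presumably because the address being closed has no entry in the per-address connection counts. The sketch below reproduces that failure mode in isolation and shows one defensive alternative. It is an assumption-based illustration, not the actual Kafka code: ConnectionCounts, decUnsafe, decSafe and the String address key are hypothetical names chosen for the example.
{code:scala}
import scala.collection.mutable

// Hypothetical stand-in for per-address connection counting; not the Kafka class.
class ConnectionCounts {
  private val counts = mutable.Map[String, Int]()

  def inc(addr: String): Unit = counts.synchronized {
    counts.put(addr, counts.getOrElse(addr, 0) + 1)
  }

  // Unsafe: Option.get throws NoSuchElementException("None.get")
  // when the address is not tracked.
  def decUnsafe(addr: String): Unit = counts.synchronized {
    val count = counts.get(addr).get
    if (count == 1) counts.remove(addr) else counts.put(addr, count - 1)
  }

  // Defensive: an untracked address is a no-op; the result reports
  // whether a decrement actually happened so the caller can log it.
  def decSafe(addr: String): Boolean = counts.synchronized {
    counts.get(addr) match {
      case Some(1) =>
        counts.remove(addr)
        true
      case Some(n) =>
        counts.put(addr, n - 1)
        true
      case None =>
        false
    }
  }

  def count(addr: String): Int = counts.synchronized {
    counts.getOrElse(addr, 0)
  }
}
{code}
With this sketch, calling decUnsafe twice for an address that was incremented once throws on the second call, while decSafe returns false. Whether a double close is what actually happens in the broker is not established by the trace alone.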
> Kafka network thread lacks top exception handler
> ------------------------------------------------
>
> Key: KAFKA-1804
> URL: https://issues.apache.org/jira/browse/KAFKA-1804
> Project: Kafka
> Issue Type: Bug
> Reporter: Oleg Golovin
>
> We have encountered a problem where some Kafka network threads fail, so that
> jstack attached to the Kafka process shows fewer threads than we defined in
> our Kafka configuration. As a result, API requests processed by a failed thread
> get stuck without a response.
> There were no error messages in the log regarding the thread failure.
> We examined the Kafka code and found that there is no top-level try-catch block in
> the network thread code, which could at least log possible errors.
> Could you add a top-level try-catch block for the network thread, which should
> recover the network thread in case of an exception?
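
As a rough illustration of the recovery behaviour requested above, the following sketch wraps each loop iteration in its own try-catch, so that one failed iteration is reported and the thread keeps running. ResilientLoop, isRunning, onError and iteration are hypothetical names for this example and do not correspond to anything in SocketServer.scala. Fatal errors such as OutOfMemoryError are deliberately not caught, because NonFatal does not match them.
{code:scala}
import scala.util.control.NonFatal

// Hypothetical helper: runs `iteration` until `isRunning` returns false.
// Non-fatal exceptions are passed to `onError` and the loop continues.
object ResilientLoop {
  def run(isRunning: () => Boolean, onError: Throwable => Unit)(iteration: () => Unit): Int = {
    var failures = 0
    while (isRunning()) {
      try iteration()
      catch {
        case NonFatal(e) =>
          failures += 1
          onError(e)
      }
    }
    failures // number of iterations that threw
  }
}
{code}
A caller could pass a logging function as onError. Note that continuing after an exception can leave per-connection state inconsistent, so a real fix would likely also need to clean up the affected connection.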
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)