We are running Kafka 0.9.0.1 in production and saw these exceptions:

[2016-06-23 22:55:10,239] INFO [KafkaApi-3] Closing connection due to error during produce request with correlation id 6 from client id kafka-python with ack=0
Topic and partition to exceptions: [xyx,8] -> kafka.common.MessageSizeTooLargeException (kafka.server.KafkaApis)


[2016-06-23 22:55:41,917] INFO Scheduling log segment 95455988 for log abc_json-7 for deletion. (kafka.log.Log)

[2016-06-23 22:55:41,921] INFO Scheduling log segment 2036034857 for log xyz_json-3 for deletion. (kafka.log.Log)

[2016-06-23 22:55:51,112] INFO Rolled new log segment for 'abc_json-7' in 1 ms. (kafka.log.Log)


[2016-06-23 22:55:59,411] ERROR Processor got uncaught exception. (kafka.network.Processor)
java.lang.OutOfMemoryError: Direct buffer memory
        at java.nio.Bits.reserveMemory(Bits.java:658)
        at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
        at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
        at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
        at sun.nio.ch.IOUtil.read(IOUtil.java:195)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at org.apache.kafka.common.network.PlaintextTransportLayer.read(PlaintextTransportLayer.java:108)
        at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:97)
        at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
        at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153)
        at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:286)
        at kafka.network.Processor.run(SocketServer.scala:413)
        at java.lang.Thread.run(Thread.java:745)


Two of our brokers were affected, and the errors caused them to slow down.

Prior to seeing these exceptions, we had performed partition reassignments on the cluster and observed a steep decrease in cache usage on the nodes.
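
For context on the first exception, our produce path (kafka-python with ack=0) looks roughly like the sketch below. The host, topic, and sizes are placeholders rather than our actual settings; the comments note the defaults we believe apply in 0.9:

    from kafka import KafkaProducer

    # Rough sketch of our produce path (placeholder host/topic, not actual code).
    # With acks=0 the client never sees an error back from the broker: the broker
    # just logs MessageSizeTooLargeException and closes the connection, as in the
    # log above.
    producer = KafkaProducer(
        bootstrap_servers='broker1:9092',  # placeholder
        acks=0,                            # matches "ack=0" in the broker log
        max_request_size=1048576,          # kafka-python's client-side cap (default 1 MB)
    )

    # The broker-side cap is message.max.bytes (default 1000012 in 0.9, also
    # overridable per topic via max.message.bytes); anything larger is rejected
    # server-side even if it fits under the client's max_request_size.
    producer.send('xyz', b'...payload...')
    producer.flush()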


We are not quite sure of the exact cause of the "Direct buffer memory" exceptions. Is it simply that Kafka is receiving messages too large for it to hold in memory?
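
Our own rough back-of-envelope, in case it helps (the per-thread temporary buffer behaviour is an assumption on our part, based on the sun.nio.ch.Util frame in the trace):

    # Back-of-envelope guess (assumption, not measured): NIO reads into heap
    # buffers go through a per-thread temporary direct buffer sized to the read
    # (the sun.nio.ch.Util.getTemporaryDirectBuffer frame above), so worst-case
    # direct memory use is roughly num.network.threads x the largest single
    # request, which the broker caps at socket.request.max.bytes.
    num_network_threads = 3               # broker default for num.network.threads
    socket_request_max_bytes = 104857600  # broker default for socket.request.max.bytes (100 MB)

    worst_case = num_network_threads * socket_request_max_bytes
    print("worst-case temporary direct buffers: %.0f MB" % (worst_case / 1024.0 / 1024.0))
    # If -XX:MaxDirectMemorySize (which defaults to roughly the heap size) is
    # below that, a burst of large requests could plausibly hit exactly this OOM.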


Thanks,

Joseph
