[ https://issues.apache.org/jira/browse/KAFKA-600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jay Kreps resolved KAFKA-600. ----------------------------- Resolution: Fixed Since there don't seem to be any objections I am closing out this issue. > kafka should respond gracefully rather than crash when unable to write due to > ENOSPC > ------------------------------------------------------------------------------------ > > Key: KAFKA-600 > URL: https://issues.apache.org/jira/browse/KAFKA-600 > Project: Kafka > Issue Type: Bug > Components: core > Reporter: Tim Stilwell > > problem: > user starts kafka with log.dir value set to a small partition and begins > writing data to the mq. when the disk partition is full, kafka crashes. > given that this product is used for both reading and writing operations, > crashing seems rather drastic even if the error message is helpful. > something more robust would be appreciated. perhaps, logging an error and > rejecting additional write requests while accepting additional read requests? > perhaps, sending an email alert to Operations? at least shutdown gracefully > so the user is aware that received messages were saved with a helpful message > providing some details of the last message received. when tens or hundreds > of thousands of messages can be processed in a second, it isn't helpful to > merely log a timestamp and crash. > steps to reproduce: > 1) download and install kafka > 2) modify server.properties > # vi /opt/kafka-0.7.2-incubating-src/config/server.properties > set log.dir="/var/log/kafka" > 3) modify log4j > # vi /opt/kafka-0.7.2-incubating-src/config/log4j.properties > set fileAppender.File=/var/log/kafka/kafka-request.log > 4) start kafka service > $ sudo bash > # ulimit -c unlimited > # /opt/kafka-0.7.2-incubating-src/bin/kafka-server-start.sh > /opt/kafka-0.7.2-incubating-src/config/server.properties & > 6) begin writing data to hostname:9092 > 7) review /var/log/kafka-request.log > results: > $ grep log.dir /opt/kafka-0.7.2-incubating-src/config/server.properties > log.dir=/var/log/kafka > $ df -h /var/log/kafka > Filesystem Size Used Avail Use% Mounted on > /dev/sda1 4.0G 4.0G 0 100% / > $ tail /var/log/kafka/kafka-request.log > 17627442 [ZkClient-EventThread-14-10.0.20.242:2181] INFO > kafka.server.KafkaZooKeeper - Begin registering broker topic > /brokers/topics/raw/0 with 1 partitions > 17627444 [ZkClient-EventThread-14-10.0.20.242:2181] INFO > kafka.server.KafkaZooKeeper - End registering broker topic > /brokers/topics/raw/0 > 17627445 [ZkClient-EventThread-14-10.0.20.242:2181] INFO > kafka.server.KafkaZooKeeper - done re-registering broker > 18337676 [kafka-processor-3] ERROR kafka.network.Processor - Closing socket > for /10.0.20.138 because of error > java.io.IOException: Connection reset by peer > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:218) > at sun.nio.ch.IOUtil.read(IOUtil.java:191) > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359) > at kafka.utils.Utils$.read(Utils.scala:538) > at > kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) > at kafka.network.Processor.read(SocketServer.scala:311) > at kafka.network.Processor.run(SocketServer.scala:214) > at java.lang.Thread.run(Thread.java:722) > 18391974 [kafka-processor-4] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18422004 [kafka-processor-5] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18434563 [kafka-processor-6] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18485005 [kafka-processor-7] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18497083 [kafka-processor-0] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18525720 [kafka-processor-1] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18543843 [kafka-processor-2] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18563230 [kafka-processor-4] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18575613 [kafka-processor-5] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.138. > 18677568 [kafka-processor-6] ERROR kafka.network.Processor - Closing socket > for /10.0.20.138 because of error > java.io.IOException: Connection reset by peer > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:218) > at sun.nio.ch.IOUtil.read(IOUtil.java:191) > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359) > at kafka.utils.Utils$.read(Utils.scala:538) > at > kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) > at kafka.network.Processor.read(SocketServer.scala:311) > at kafka.network.Processor.run(SocketServer.scala:214) > at java.lang.Thread.run(Thread.java:722) > 18828016 [kafka-processor-7] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.248. > 18844274 [kafka-processor-0] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.248. > 18849691 [kafka-processor-1] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.248. > 18896883 [kafka-processor-2] INFO kafka.network.Processor - Closing socket > connection to /10.0.20.248. > 22383195 [kafka-processor-2] FATAL kafka.log.Log - Halting due to > unrecoverable I/O error while handling producer request > java.io.IOException: No space left on device > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:59) > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:89) > at sun.nio.ch.IOUtil.write(IOUtil.java:60) > at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:195) > at > kafka.message.ByteBufferMessageSet.writeTo(ByteBufferMessageSet.scala:76) > at kafka.message.FileMessageSet.append(FileMessageSet.scala:159) > at kafka.log.LogSegment.append(Log.scala:105) > at kafka.log.Log.liftedTree1$1(Log.scala:246) > at kafka.log.Log.append(Log.scala:242) > at > kafka.server.KafkaRequestHandlers.kafka$server$KafkaRequestHandlers$$handleProducerRequest(KafkaRequestHandlers.scala:69) > at > kafka.server.KafkaRequestHandlers.handleProducerRequest(KafkaRequestHandlers.scala:53) > at > kafka.server.KafkaRequestHandlers$$anonfun$handlerFor$1.apply(KafkaRequestHandlers.scala:38) > at > kafka.server.KafkaRequestHandlers$$anonfun$handlerFor$1.apply(KafkaRequestHandlers.scala:38) > at kafka.network.Processor.handle(SocketServer.scala:296) > at kafka.network.Processor.read(SocketServer.scala:319) > at kafka.network.Processor.run(SocketServer.scala:214) > at java.lang.Thread.run(Thread.java:722) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira