[ https://issues.apache.org/jira/browse/KAFKA-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681680#comment-14681680 ]
PC commented on KAFKA-2078:
---------------------------

I can reproduce this bug, though it is a challenge to do so.

Running on Mac OS X 10.9.5, 16 GB RAM, Java version 1.8.0_40.

It only appears to affect the producer: org.apache.kafka.clients.producer.KafkaProducer 0.8.2.1

Setup: 3 producers pumping test data to one kafka-server, with 1 replica, all running locally on the same machine. Each producer uses the async .send(producerRecord, callBack) method. The configs are at the bottom of this post.

Here is a log snippet:

16:21:51.527 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330477
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330478
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330479
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330480
16:21:51.528 [message-consumer-akka.actor.default-dispatcher-5] DEBUG producer - PumpSuccess topic: test partition 0 offset: 3330481
16:26:26.220 [kafka-producer-network-thread | producer-3] WARN o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [kafka-clients-0.8.2.1.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:26:26.220 [kafka-producer-network-thread | producer-2] WARN o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [kafka-clients-0.8.2.1.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
16:26:26.220 [kafka-producer-network-thread | producer-1] WARN o.a.kafka.common.network.Selector - Error in I/O with localhost/127.0.0.1
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [kafka-clients-0.8.2.1.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [kafka-clients-0.8.2.1.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]

Pay attention to the timestamps. Less than 5 minutes after the producers had FINISHED pumping the data, these 3 exceptions were logged by the kafka-producer internals.

Worse, this bug also occurred while pumping messages to the broker, 2 days ago. When it kicked in, the CallBack code was not called for 3 messages (1 per producer), nor was an exception thrown in my application. This can potentially lead to serious data loss and has severe implications. I would in a heartbeat upgrade this bug to SEVERE/CRITICAL rather than Major.

A temporary (unacceptable) workaround is to block on each send with a timeout, to ensure we didn't lose data when this bug manifests itself:

    try {
      ....
      kafkaProducer.send(record, callBack).get(5, TimeUnit.SECONDS)
    } catch {
      ....
    }

This approach reduces the pumping throughput to roughly 5k messages/sec, down from ~60k messages/sec with the async send, for a single producer.

Config properties:

Kafka-Server:

broker.id=0
port=9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
#log.flush.interval.messages=10000
log.flush.interval.ms=5000
delete.topic.enable=true
log.retention.hours=2147483640
log.segment.bytes=1073741824
log.retention.check.interval.ms=30000000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=12000
offsets.topic.retention.minutes=28800
offset.metadata.max.bytes=4096
offsets.topic.num.partitions=50
offsets.retention.check.interval.ms=600000
offsets.topic.replication.factor=3
offsets.topic.segment.bytes=104857600
offsets.load.buffer.size=5242880
offsets.commit.required.acks=-1
offsets.commit.timeout.ms=5000
default.replication.factor=1
num.partitions=1
auto.create.topics.enable=true
unclean.leader.election.enable=false

Zookeeper:

dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0

Producer:

kafkaProducerProps.put(ProducerConfig.ACKS_CONFIG, "1")
kafkaProducerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092")
kafkaProducerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
kafkaProducerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)

Could anyone seriously look into this problem? It really does exist.

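For anyone who wants to detect the missing callbacks without blocking on every send, here is a minimal sketch of one possible approach. It only detects sends whose callback never fired; it does not fix the underlying bug. PendingSends is a hypothetical helper, not part of kafka-clients, and the injectable clock parameter is an assumption made so the logic can be exercised without a broker.

```scala
import scala.collection.concurrent.TrieMap
import java.util.concurrent.atomic.AtomicLong

/** Hypothetical helper (not part of kafka-clients): remembers every send
  * whose callback has not fired yet, so lost callbacks can be spotted
  * without blocking on each individual send.
  */
final class PendingSends(timeoutMs: Long,
                         now: () => Long = () => System.currentTimeMillis()) {
  // send id -> time at which the send was registered
  private val pending = TrieMap.empty[Long, Long]
  private val nextId = new AtomicLong(0L)

  /** Call just before producer.send; pass the returned id to the callback. */
  def register(): Long = {
    val id = nextId.getAndIncrement()
    pending.put(id, now())
    id
  }

  /** Call from the send callback, on success or on failure. */
  def complete(id: Long): Unit = {
    pending.remove(id)
    ()
  }

  /** Ids registered at least timeoutMs ago whose callback has not fired. */
  def overdue(): List[Long] = {
    val cutoff = now() - timeoutMs
    pending.collect { case (id, started) if started <= cutoff => id }.toList.sorted
  }
}
```

The intended wiring, under the same assumptions: call register() right before kafkaProducer.send(record, callBack), call complete(id) inside the callback's onCompletion, and have a periodic task log or re-send whatever overdue() returns. Since nothing blocks per message, the async throughput should be largely preserved, though I have not measured it.
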
> Getting Selector [WARN] Error in I/O with host java.io.EOFException
> -------------------------------------------------------------------
>
> Key: KAFKA-2078
> URL: https://issues.apache.org/jira/browse/KAFKA-2078
> Project: Kafka
> Issue Type: Bug
> Components: producer
> Affects Versions: 0.8.2.0
> Environment: OS Version: 2.6.39-400.209.1.el5uek and Hardware: 8 x
> Intel(R) Xeon(R) CPU X5660 @ 2.80GHz/44GB
> Reporter: Aravind
> Assignee: Jun Rao
>
> When trying to Produce 1000 (10 MB) messages, getting this below error some
> where between 997 to 1000th message. There is no pattern but able to
> reproduce.
> [PDT] 2015-03-31 13:53:50 Selector [WARN] Error in I/O with "our host"
> java.io.EOFException
>     at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:248)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192)
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
>     at java.lang.Thread.run(Thread.java:724)
> This error I am getting some times @ 997th message or 999th message. There is
> no pattern but able to reproduce.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)