[
https://issues.apache.org/jira/browse/KAFKA-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196471#comment-15196471
]
Mart Haitjema commented on KAFKA-3205:
--------------------------------------
I also ran into this issue and discovered that the broker closes connections
that have been idle for longer than connections.max.idle.ms
(https://kafka.apache.org/090/configuration.html#brokerconfigs), which has a
default of 10 minutes.
While this parameter was introduced in 0.8.2
(https://kafka.apache.org/082/configuration.html#brokerconfigs), it wasn't
actually enforced by the broker until 0.9.0, which closes idle connections in
Selector.java::maybeCloseOldestConnection()
(see
https://github.com/apache/kafka/commit/78ba492e3e70fd9db61bc82469371d04a8d6b762#diff-d71b50516bd2143d208c14563842390a).
While the producer config also defines this parameter, with a default of 9
minutes, it does not appear to be respected by the 0.8.2.x clients, which means
idle connections aren't being closed on the client side but are timed out by
the broker.
When the broker drops the connection, it surfaces on the producer side as a
java.io.EOFException: null that looks exactly like the one shown in the
description.
To work around this issue, we explicitly set connections.max.idle.ms to
something very large (e.g. 1 year) in the broker config, which seems to have
mitigated the problem for us.
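The broker-side workaround amounts to a single line in server.properties; a sketch, assuming a value of roughly one year expressed in milliseconds:

```properties
# Workaround: push the broker's idle-connection timeout far beyond any
# realistic idle period (31536000000 ms = 365 days).
connections.max.idle.ms=31536000000
```

Note this trades the original keep-alive behaviour for never reaping idle connections on the broker, so file-descriptor usage may grow with many short-lived clients.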
> Error in I/O with host (java.io.EOFException) raised in producer
> ----------------------------------------------------------------
>
> Key: KAFKA-3205
> URL: https://issues.apache.org/jira/browse/KAFKA-3205
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 0.8.2.1, 0.9.0.0
> Reporter: Jonathan Raffre
>
> In a situation with a Kafka broker on 0.9 and producers still on 0.8.2.x,
> producers seem to raise the following after a variable amount of time after
> start:
> {noformat}
> 2016-01-29 14:33:13,066 WARN [] o.a.k.c.n.Selector: Error in I/O with 172.22.2.170
> java.io.EOFException: null
>     at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-internal]
> {noformat}
> This can be reproduced successfully by doing the following:
> * Start a 0.8.2 producer connected to the 0.9 broker
> * Wait exactly 15 minutes
> * See the error in the producer logs.
> Oddly, this also shows up in an active producer but after 10 minutes of
> activity.
> Kafka's server.properties :
> {noformat}
> broker.id=1
> listeners=PLAINTEXT://:9092
> port=9092
> num.network.threads=2
> num.io.threads=2
> socket.send.buffer.bytes=1048576
> socket.receive.buffer.bytes=1048576
> socket.request.max.bytes=104857600
> log.dirs=/mnt/data/kafka
> num.partitions=4
> auto.create.topics.enable=false
> delete.topic.enable=true
> num.recovery.threads.per.data.dir=1
> log.retention.hours=48
> log.retention.bytes=524288000
> log.segment.bytes=52428800
> log.retention.check.interval.ms=60000
> log.roll.hours=24
> log.cleanup.policy=delete
> log.cleaner.enable=true
> zookeeper.connect=127.0.0.1:2181
> zookeeper.connection.timeout.ms=1000000
> {noformat}
> Producer's configuration :
> {noformat}
> compression.type = none
> metric.reporters = []
> metadata.max.age.ms = 300000
> metadata.fetch.timeout.ms = 60000
> acks = all
> batch.size = 16384
> reconnect.backoff.ms = 10
> bootstrap.servers = [127.0.0.1:9092]
> receive.buffer.bytes = 32768
> retry.backoff.ms = 500
> buffer.memory = 33554432
> timeout.ms = 30000
> key.serializer = class org.apache.kafka.common.serialization.StringSerializer
> retries = 3
> max.request.size = 5000000
> block.on.buffer.full = true
> value.serializer = class org.apache.kafka.common.serialization.StringSerializer
> metrics.sample.window.ms = 30000
> send.buffer.bytes = 131072
> max.in.flight.requests.per.connection = 5
> metrics.num.samples = 2
> linger.ms = 0
> client.id =
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)