[ 
https://issues.apache.org/jira/browse/KAFKA-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196471#comment-15196471
 ] 

Mart Haitjema commented on KAFKA-3205:
--------------------------------------

I also ran into this issue and found that the broker closes connections that 
have been idle for longer than connections.max.idle.ms 
(https://kafka.apache.org/090/configuration.html#brokerconfigs), which 
defaults to 10 minutes.
Although this parameter was introduced in 0.8.2 
(https://kafka.apache.org/082/configuration.html#brokerconfigs), the broker 
did not actually enforce it until 0.9.0, which closes idle connections in 
Selector.java::maybeCloseOldestConnection()
(see 
https://github.com/apache/kafka/commit/78ba492e3e70fd9db61bc82469371d04a8d6b762#diff-d71b50516bd2143d208c14563842390a).
The producer config also defines this parameter, with a default of 9 
minutes, but the 0.8.2.x clients do not appear to respect it. As a result, 
idle connections are never closed on the client side and are instead timed 
out by the broker.
When the broker drops the connection, the producer logs a 
java.io.EOFException: null that looks exactly like the one shown in the 
description.
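
To illustrate the idle-expiry behaviour described above, here is a minimal 
sketch. It is not Kafka's Selector code: the class IdleConnectionTracker and 
its methods touch() and maybeCloseOldest() are hypothetical names, and the 
only assumption is that connections are kept in least-recently-used order and 
the oldest one is dropped once it has been idle for longer than the limit.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT Kafka's Selector: tracks the last activity time of
// each connection in least-recently-used order and expires the oldest one
// once it has been idle for longer than maxIdleMs.
class IdleConnectionTracker {
    private final long maxIdleMs;
    // accessOrder=true keeps the least recently touched connection first.
    private final Map<String, Long> lastActiveMs =
            new LinkedHashMap<>(16, 0.75f, true);

    IdleConnectionTracker(long maxIdleMs) {
        this.maxIdleMs = maxIdleMs;
    }

    // Record traffic on a connection at the given time.
    void touch(String connectionId, long nowMs) {
        lastActiveMs.put(connectionId, nowMs);
    }

    // Returns the id of the expired connection, or null if the oldest
    // connection has not yet been idle for longer than maxIdleMs.
    String maybeCloseOldest(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = lastActiveMs.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<String, Long> oldest = it.next();
        if (nowMs - oldest.getValue() > maxIdleMs) {
            String id = oldest.getKey();
            it.remove();
            return id;
        }
        return null;
    }
}
```

With a 10 minute limit on the broker side, a client that never closes its own 
idle connections (as the 0.8.2.x producer appears to behave) only finds out 
the connection is gone on its next read, which would be consistent with the 
EOFException.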

To work around this issue, we explicitly set connections.max.idle.ms to a 
very large value (e.g. 1 year) in the broker config, which seems to have 
mitigated the problem for us.
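
As a rough sketch of that workaround, the snippet below computes one year in 
milliseconds and prints the corresponding broker config line. The class name 
BrokerIdleTimeout and its methods are hypothetical, and "1 year" is assumed 
here to mean 365 days; pick whatever large value suits your setup.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: builds the broker config line for the workaround.
class BrokerIdleTimeout {
    // 365 days in milliseconds. This exceeds Integer.MAX_VALUE, so it must
    // be handled as a long.
    static long oneYearMs() {
        return TimeUnit.DAYS.toMillis(365);
    }

    // The line to add to the broker's server.properties.
    static String brokerConfigLine() {
        return "connections.max.idle.ms=" + oneYearMs();
    }

    public static void main(String[] args) {
        System.out.println(brokerConfigLine());
    }
}
```

The printed line goes into the broker's server.properties, and the broker 
needs a restart to pick it up.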


> Error in I/O with host (java.io.EOFException) raised in producer
> ----------------------------------------------------------------
>
>                 Key: KAFKA-3205
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3205
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.8.2.1, 0.9.0.0
>            Reporter: Jonathan Raffre
>
> With a Kafka broker on 0.9 and producers still on 0.8.2.x, the producers 
> seem to raise the following error after a variable amount of time since 
> startup:
> {noformat}
> 2016-01-29 14:33:13,066 WARN [] o.a.k.c.n.Selector: Error in I/O with 172.22.2.170
> java.io.EOFException: null
>         at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62) ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>         at org.apache.kafka.common.network.Selector.poll(Selector.java:248) ~[org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>         at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) [org.apache.kafka.kafka-clients-0.8.2.0.jar:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66-internal]
> {noformat}
> This can be reproduced by doing the following:
>  * Start a 0.8.2 producer connected to the 0.9 broker
>  * Wait exactly 15 minutes
>  * See the error in the producer logs.
> Oddly, this also shows up in an active producer, but after 10 minutes of 
> activity.
> Kafka's server.properties:
> {noformat}
> broker.id=1
> listeners=PLAINTEXT://:9092
> port=9092
> num.network.threads=2
> num.io.threads=2
> socket.send.buffer.bytes=1048576
> socket.receive.buffer.bytes=1048576
> socket.request.max.bytes=104857600
> log.dirs=/mnt/data/kafka
> num.partitions=4
> auto.create.topics.enable=false
> delete.topic.enable=true
> num.recovery.threads.per.data.dir=1
> log.retention.hours=48
> log.retention.bytes=524288000
> log.segment.bytes=52428800
> log.retention.check.interval.ms=60000
> log.roll.hours=24
> log.cleanup.policy=delete
> log.cleaner.enable=true
> zookeeper.connect=127.0.0.1:2181
> zookeeper.connection.timeout.ms=1000000
> {noformat}
> Producer's configuration:
> {noformat}
>       compression.type = none
>       metric.reporters = []
>       metadata.max.age.ms = 300000
>       metadata.fetch.timeout.ms = 60000
>       acks = all
>       batch.size = 16384
>       reconnect.backoff.ms = 10
>       bootstrap.servers = [127.0.0.1:9092]
>       receive.buffer.bytes = 32768
>       retry.backoff.ms = 500
>       buffer.memory = 33554432
>       timeout.ms = 30000
>       key.serializer = class org.apache.kafka.common.serialization.StringSerializer
>       retries = 3
>       max.request.size = 5000000
>       block.on.buffer.full = true
>       value.serializer = class org.apache.kafka.common.serialization.StringSerializer
>       metrics.sample.window.ms = 30000
>       send.buffer.bytes = 131072
>       max.in.flight.requests.per.connection = 5
>       metrics.num.samples = 2
>       linger.ms = 0
>       client.id = 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
