Benjamin Vetter created KAFKA-3399:
--------------------------------------

             Summary: Offset/logfile corruption and kafka closing connections
                 Key: KAFKA-3399
                 URL: https://issues.apache.org/jira/browse/KAFKA-3399
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.9.0.1
            Reporter: Benjamin Vetter

Hi,

it seems Kafka created some kind of gap and corruption within the log files. It seems to have happened during log rotation. The gap is of course no problem, but the "corruption" is, because Kafka closes the connection when I try to fetch messages around offset 73581, so my worker can't proceed.

Fetching offset 73580 works:

{code}
irb(main):011:0> client.fetch_messages(:topic => "my-topic", :offset => 73580, :partition => 0, :max_wait_time => 10).last
=> #<Kafka::FetchedMessage:0x000000077a0b28 ...>
irb(main):003:0> client.fetch_messages(:topic => "my-topic", :offset => 73581, :partition => 0, :max_wait_time => 10).last
Kafka::ConnectionError: Connection error: EOFError
irb(main):004:0> client.fetch_messages(:topic => "my-topic", :offset => 73582, :partition => 0, :max_wait_time => 10).last
Kafka::ConnectionError: Connection error: EOFError
...
irb(main):007:0> client.fetch_messages(:topic => "my-topic", :offset => 73641, :partition => 0, :max_wait_time => 10).first
Kafka::ConnectionError: Connection error: EOFError
irb(main):005:0> client.fetch_messages(:topic => "my-topic", :offset => 73642, :partition => 0, :max_wait_time => 10).last
=> #<Kafka::FetchedMessage:0x000000072a2a40 ...>
{code}

The EOFError occurs because Kafka closes the connection. Starting from offset 73642, fetching messages works again.

Unsurprisingly, the respective log files look like this:

kafka kafka    12256 Mär 15 07:54 00000000000000000000.index
kafka kafka  6392949 Mär 15 07:54 00000000000000000000.log
kafka kafka 10485760 Mär 15 07:54 00000000000000073581.index
kafka kafka  1520909 Mär 15 09:02 00000000000000073581.log

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
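
For readers following the file listing: Kafka names each log segment after its base offset, zero-padded to 20 digits, so the first failing offset (73581) is exactly the base offset of the second segment. The sketch below only illustrates that mapping from the file names. The helper names `base_offset` and `segment_for` are hypothetical and not part of Kafka or ruby-kafka, and the code does not read or validate the segment files themselves.

```ruby
# Segment file names taken from the directory listing in the report.
SEGMENT_FILES = %w[
  00000000000000000000.log
  00000000000000073581.log
].freeze

# Hypothetical helper: returns the base offset encoded in a segment file name.
def base_offset(file_name)
  File.basename(file_name, ".log").to_i
end

# Hypothetical helper: returns the segment with the largest base offset that
# does not exceed `offset`, or nil if the offset precedes every segment.
def segment_for(offset, files = SEGMENT_FILES)
  files.select { |f| base_offset(f) <= offset }.max_by { |f| base_offset(f) }
end

# Print the segment for the offsets probed in the irb session.
[73580, 73581, 73641, 73642].each do |offset|
  puts "#{offset} -> #{segment_for(offset)}"
end
```

Under this mapping, offset 73580 falls in the first segment, while the failing range 73581 to 73641 and the working offset 73642 all fall in the second one, which is consistent with the problem starting at the rotation boundary.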