[ 
https://issues.apache.org/jira/browse/KAFKA-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945156#comment-15945156
 ] 

Ralph Weires commented on KAFKA-3039:
-------------------------------------

I experienced the same problem on v0.10.2.0 while upgrading brokers from 
0.8.2.1 to that version. One of the brokers truncated almost all of its data on 
startup, which was its first startup on 0.10.2.0 as part of a 
rolling restart of the cluster. Luckily, it was the only broker that behaved 
like this...

Log snippet from the affected broker, for a sample partition:
[2017-03-28 09:42:01,614] INFO Completed load of log entries-112 with 12 log 
segments and log end offset 354903677 in 14 ms (kafka.log.Log)
[2017-03-28 09:42:42,025] INFO Partition [entries,112] on broker 19: No 
checkpointed highwatermark is found for partition entries-112 
(kafka.cluster.Partition)
[2017-03-28 09:42:48,031] INFO Truncating log entries-112 to offset 0. 
(kafka.log.Log)
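
To check whether other partitions or brokers were hit the same way, the broker
log can be scanned for the pattern shown above: a log that loads with a
non-zero end offset and is later truncated to offset 0. The following is only a
rough sketch. The regular expressions are modeled on the log lines quoted in
this issue (other Kafka versions may word them differently), the function name
is made up for illustration, and it assumes each log entry sits on a single
line, as in the broker's log file, rather than wrapped as in this email.

```python
import re

# Patterns modeled on the INFO lines quoted in this issue (an assumption:
# other Kafka versions may format these messages differently).
LOADED = re.compile(r"Completed load of log (\S+) with .*?log end offset (\d+)")
TRUNCATED = re.compile(r"Truncating log (\S+) to offset (\d+)")


def find_full_truncations(lines):
    """Return (partition, previous_end_offset) pairs for logs that were
    loaded with data and later truncated to offset 0."""
    end_offsets = {}  # partition name -> log end offset reported at load time
    hits = []
    for line in lines:
        loaded = LOADED.search(line)
        if loaded:
            end_offsets[loaded.group(1)] = int(loaded.group(2))
            continue
        truncated = TRUNCATED.search(line)
        if truncated and int(truncated.group(2)) == 0:
            previous = end_offsets.get(truncated.group(1), 0)
            if previous > 0:
                hits.append((truncated.group(1), previous))
    return hits
```

It could be fed the lines of a broker log file, for example
`find_full_truncations(open(path))`, where `path` is wherever the broker
writes its server log on that installation.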


> Temporary loss of leader resulted in log being completely truncated
> -------------------------------------------------------------------
>
>                 Key: KAFKA-3039
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3039
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.9.0.0
>         Environment: Debian 3.2.54-2 x86_64 GNU/Linux
>            Reporter: Imran Patel
>            Priority: Critical
>              Labels: reliability
>
> We had an event recently where the temporary loss of a leader for a 
> partition (during a manual restart) resulted in the leader coming back with 
> no high watermark state and truncating its log to zero. Logs (attached below) 
> indicate that it did have the data but not the commit state. How is this 
> possible?
> Leader (broker 3)
> [2015-12-18 21:19:44,666] INFO Completed load of log messages-14 with log end 
> offset 14175963374 (kafka.log.Log)
> [2015-12-18 21:19:45,170] INFO Partition [messages,14] on broker 3: No 
> checkpointed highwatermark is found for partition [messages,14] 
> (kafka.cluster.Partition)
> [2015-12-18 21:19:45,238] INFO Truncating log messages-14 to offset 0. 
> (kafka.log.Log)
> [2015-12-18 21:20:34,066] INFO Partition [messages,14] on broker 3: Expanding 
> ISR for partition [messages,14] from 3 to 3,10 (kafka.cluster.Partition)
> Replica (broker 10)
> [2015-12-18 21:19:19,525] INFO Partition [messages,14] on broker 10: 
> Shrinking ISR for partition [messages,14] from 3,10,4 to 10,4 
> (kafka.cluster.Partition)
> [2015-12-18 21:20:34,049] ERROR [ReplicaFetcherThread-0-3], Current offset 
> 14175984203 for partition [messages,14] out of range; reset offset to 35977 
> (kafka.server.ReplicaFetcherThread)
> [2015-12-18 21:20:34,033] WARN [ReplicaFetcherThread-0-3], Replica 10 for 
> partition [messages,14] reset its fetch offset from 14175984203 to current 
> leader 3's latest offset 35977 (kafka.server.ReplicaFetcherThread)
> Some relevant config parameters:
>         offsets.topic.replication.factor = 3
>         offsets.commit.required.acks = -1
>         replica.high.watermark.checkpoint.interval.ms = 5000
>         unclean.leader.election.enable = false



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
