Hi,
Converting lastModifiedTime on that segment to human-readable time gives:
Monday, April 27, 2020 9:14:19 AM UTC

What time zone is the server in? In other words, which time zone does the
timestamp [2020-04-27 10:36:40,386] from the log use?
It looks as though largestTime is a property of the log record, and 0 means
the log record is empty.
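
For anyone who wants to reproduce the conversion above, here is a minimal
Python sketch. The helper name epoch_ms_to_utc is just an illustration, not
anything from Kafka; it assumes the value is epoch milliseconds, which the
converted result above is consistent with.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms: int) -> str:
    """Render an epoch-millisecond timestamp as a UTC string."""
    # Dividing by 1000 converts milliseconds to the seconds that
    # datetime.fromtimestamp expects; tz=timezone.utc avoids any
    # dependence on the local time zone of the machine running this.
    moment = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


# lastModifiedTime from the quoted log line
print(epoch_ms_to_utc(1587978859000))  # 2020-04-27 09:14:19 UTC
# largestTime from the quoted log line
print(epoch_ms_to_utc(0))              # 1970-01-01 00:00:00 UTC
```

Read as a timestamp, largestTime=0 is the Unix epoch, which is far older
than any retention window.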

    On Tuesday, April 28, 2020, 04:37:03 PM GMT+2, JP MB 
<jose.brandao1...@gmail.com> wrote:  
 
 Hi,
We have messages disappearing from topics on Apache Kafka with versions
2.3, 2.4.0, 2.4.1 and 2.5.0. We notice this when we do a rolling
deployment of our clusters. Unfortunately, it doesn't happen every time,
so it's very inconsistent.

Sometimes we lose all messages inside a topic, other times we lose all
messages inside a partition. When this happens, the following log entry
always appears:

[2020-04-27 10:36:40,386] INFO [Log partition=test-lost-messages-5,
dir=/var/kafkadata/data01/data] Deleting segments
List(LogSegment(baseOffset=6, size=728,
lastModifiedTime=1587978859000, largestTime=0)) (kafka.log.Log)

There is also a previous log saying this segment hit the retention time
breach of 48 hours. In this example, the message was produced ~12 minutes
before the deployment.

Note that all messages that are wrongly deleted have largestTime=0, and the
ones that are properly deleted have a valid timestamp there. From what
we read in the documentation and code, it looks like largestTime is used
to calculate whether a given segment has reached the time breach or not.
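
To illustrate why largestTime=0 would matter, here is a simplified Python
model of a time-based retention check. This is a sketch and not Kafka's
actual code: it assumes the check has the form "now - largestTime >
retention", the function name retention_breached is hypothetical, and the
value of now_ms assumes the log timestamp is in UTC, which is not confirmed.

```python
RETENTION_MS = 48 * 60 * 60 * 1000  # 48 hours, as in our configuration


def retention_breached(now_ms: int, largest_time_ms: int,
                       retention_ms: int = RETENTION_MS) -> bool:
    """Assumed form of the check: a segment is eligible for deletion
    when its largest timestamp is older than the retention window."""
    return now_ms - largest_time_ms > retention_ms


# Assumption: [2020-04-27 10:36:40,386] interpreted as UTC.
now_ms = 1587983800386

# A segment whose largest timestamp equals the lastModifiedTime in the log:
print(retention_breached(now_ms, 1587978859000))  # False
# The same check with largestTime=0, as seen in the log:
print(retention_breached(now_ms, 0))              # True
```

Under this assumed check, a segment reporting largestTime=0 looks decades
old and is treated as past retention regardless of when its messages were
produced.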

Since we can observe this in multiple versions of Kafka, we think this
might be related to something external to Kafka, e.g. Zookeeper.

Does anyone have any ideas of why this could be happening?
For the record, we are using Zookeeper 3.6.0.
  
