Looking at the release notes and various tickets, it looks like this *is* meant to be addressed now.
http://kafka.apache.org/documentation.html#upgrade_10_1_breaking

A related issue remains open though, lacking confirmation that it's fixed:

https://issues.apache.org/jira/browse/KAFKA-1379

For me, the .timeindex files of all but the latest segment are zero bytes in
size. I wonder if that's the problem here?

...
-rw-r--r-- 1 root root    1743368 Apr 20 00:59 00000000000150431147.index
-rw-r--r-- 1 root root 1073739250 Apr 20 00:59 00000000000150431147.log
-rw-r--r-- 1 root root          0 Apr 20 00:59 00000000000150431147.timeindex
-rw-r--r-- 1 root root    1742824 Apr 20 05:16 00000000000150869687.index
-rw-r--r-- 1 root root 1073741093 Apr 20 05:16 00000000000150869687.log
-rw-r--r-- 1 root root          0 Apr 20 05:16 00000000000150869687.timeindex
-rw-r--r-- 1 root root   10485760 Apr 20 07:29 00000000000151307816.index
-rw-r--r-- 1 root root  535123214 Apr 20 07:29 00000000000151307816.log
-rw-r--r-- 1 root root   10485756 Apr 20 05:16 00000000000151307816.timeindex

-Gwilym

On 20 April 2017 at 06:46, Gwilym Evans <[email protected]> wrote:

> inter.broker.protocol.version = 0.10.1-IV2
> log.message.format.version = 0.10.1-IV2
>
> It will take me longer to check the producer/consumer versions, but I
> believe they're all *at least* 0.10.
>
> -Gwilym
>
>
> On 20 April 2017 at 06:42, Manikumar <[email protected]> wrote:
>
>> You may be producing in the old message format. Check the
>> "log.message.format.version" config.
>> What is the version of the Producer/Consumer clients?
>>
>>
>> On Thu, Apr 20, 2017 at 11:39 AM, Gwilym Evans <
>> [email protected]> wrote:
>>
>> > I am running 0.10.1.0 so, if that's true, it might not be a default.
>> > If you know of a config value to change, that would be very helpful.
>> >
>> > -Gwilym
>> >
>> > On 20 April 2017 at 06:07, Manikumar <[email protected]> wrote:
>> >
>> > > AFAIK, this behavior was changed in the 0.10.1.0 release. Retention
>> > > is now based on the largest timestamp of the messages in a log
>> > > segment.
>> > >
>> > > On Thu, Apr 20, 2017 at 11:19 AM, Gwilym Evans <
>> > > [email protected]> wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > Yesterday, I had to replace a faulty Kafka broker node, and the
>> > > > method of replacement involved bringing up a blank replacement
>> > > > using the old broker's ID, thus triggering a replication of all
>> > > > its old partitions.
>> > > >
>> > > > Today I was dealing with disk usage alerts for only that broker:
>> > > > it turned out that the broker was not deleting old logs like the
>> > > > rest of the nodes.
>> > > >
>> > > > I haven't checked the code, but eventually I came to the
>> > > > conclusion that Kafka log file deletion is based on file create or
>> > > > modified time, rather than the max produce time of the messages
>> > > > within the log file itself.
>> > > >
>> > > > This makes my method of replacing a faulty node with a blank slate
>> > > > problematic, since five-day-old messages will be stored in a file
>> > > > with a recent c/mtime, and thus won't be deleted and will soon
>> > > > cause disk space exhaustion.
>> > > >
>> > > > My temporary workaround was to reduce retention of the largest
>> > > > topic to 24 hours, but I'd prefer not to do that since it's more
>> > > > manual work and it breaks my SLA.
>> > > >
>> > > > Can this behaviour of Kafka be changed via configs at all?
>> > > >
>> > > > Has anyone faced a similar problem and have any suggestions?
>> > > >
>> > > > Thanks,
>> > > > Gwilym
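[Editor's note] The broker settings discussed in the thread can be summarised in a server.properties fragment. This is an illustrative sketch, not a recommendation; the retention value is a placeholder:

```
# Settings mentioned in this thread (values illustrative):
inter.broker.protocol.version=0.10.1-IV2
log.message.format.version=0.10.1-IV2

# Time-based retention. From 0.10.1 onward this is driven by the largest
# message timestamp in a segment rather than the file's mtime, provided
# messages carry timestamps (i.e. are in the 0.10+ message format).
log.retention.hours=120
```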
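[Editor's note] The zero-byte .timeindex files above are the interesting clue: timestamp-based retention can only work for a segment whose time index actually holds entries. A minimal sketch of inspecting such a file offline, assuming the 0.10.x on-disk format of 12-byte entries (an 8-byte big-endian timestamp followed by a 4-byte relative offset; preallocated unused slots are all zeros):

```python
# Sketch: parse a Kafka .timeindex file and report its largest timestamp.
# Assumes 12-byte entries: 8-byte big-endian timestamp + 4-byte relative
# offset. A zero-length file (as in the listing above) has no entries.
import struct

def read_timeindex(data: bytes):
    """Return a list of (timestamp_ms, relative_offset) entries."""
    entries = []
    usable = len(data) - len(data) % 12  # ignore any partial trailing bytes
    for pos in range(0, usable, 12):
        ts, rel = struct.unpack(">qi", data[pos:pos + 12])
        if ts == 0 and rel == 0:  # preallocated, unused slot
            continue
        entries.append((ts, rel))
    return entries

def max_timestamp(data: bytes):
    """Largest timestamp recorded in the index, or None if it is empty."""
    entries = read_timeindex(data)
    return max(ts for ts, _ in entries) if entries else None
```

Pointing this at the zero-byte files would report no timestamps at all, which is consistent with the broker having nothing to base timestamp retention on for those segments.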
