AFAIK, this behavior changed in the 0.10.1.0 release. Retention is now
based on the largest timestamp of the messages in a log segment rather
than the segment file's mtime.
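
If you want to confirm what timestamps a segment actually carries, the
DumpLogSegments tool will print them on 0.10+ brokers. A minimal sketch,
where the log dir path, topic-partition and segment file name are
placeholders for your setup:

    bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
      --files /var/kafka-logs/my-topic-0/00000000000000000000.log \
      --print-data-log | head

Each record line should include a timestamp field; on 0.10.1.0+ the
broker tracks the largest of these per segment when applying retention.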

On Thu, Apr 20, 2017 at 11:19 AM, Gwilym Evans <gwilym.ev...@bigcommerce.com> wrote:

> Hello,
>
> Yesterday I had to replace a faulty Kafka broker node. The replacement
> method involved bringing up a blank node with the old broker's ID, thus
> triggering re-replication of all its old partitions.
>
> Today I was dealing with disk usage alerts for only that broker: it turned
> out that the broker was not deleting old logs like the rest of the nodes.
>
> I haven't checked the code, but eventually I came to the conclusion that
> Kafka log file deletion is based on file create or modified time, rather
> than the max produce time of the messages within the log file itself.
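
A quick way to see the c/mtime that the pre-0.10.1.0 check keys off is
plain ls against the broker's log dir (the path below is a placeholder):

    ls -l --time-style=full-iso /var/kafka-logs/my-topic-0/*.log

On a freshly re-replicated broker these will all show the replication
time rather than the original write time, which matches what you saw.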
>
> This makes my method of replacing a faulty node with a blank slate
> problematic, since five-day-old messages end up in files with a recent
> c/mtime, so they won't be deleted and will soon cause disk space
> exhaustion.
>
> My temporary workaround was to reduce retention on the largest topic to
> 24 hours, but I'd prefer not to do that since it's more manual work and
> it breaks my SLA.
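
For anyone needing that kind of per-topic override, kafka-configs.sh can
set and later remove it (the topic name and ZooKeeper address below are
placeholders):

    # set a 24h override (retention.ms is in milliseconds)
    bin/kafka-configs.sh --zookeeper zk:2181 --alter \
      --entity-type topics --entity-name my-topic \
      --add-config retention.ms=86400000

    # remove the override once the backlog has aged out
    bin/kafka-configs.sh --zookeeper zk:2181 --alter \
      --entity-type topics --entity-name my-topic \
      --delete-config retention.ms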
>
> Can this behaviour of Kafka be changed via configs at all?
>
> Has anyone faced a similar problem and have suggestions?
>
> Thanks,
> Gwilym
>
