Hello, We have a cluster of 3 Kafka brokers working in KRaft mode as a docker containers, configured to clean up old messages with reteintion.ms parameter. After some period of time Kafka suddenly stopped to clean up old messages in several partitions in one topic, while other partitions of same topic are cleaned up normally according to retention.ms. After manual cleanup of affected partitions problem is gone and log segments are rolled and deleted according to retention.ms and segment.ms again without any issues.
Affected partitions are assigned to brokers correctly. In broker logs there are no errors or warnings found. >From logs attached below it can be seen that at some moment log segments just >stopped to be deleted, but new segments are rolled normally according to >segment.ms. Topic configurations: cleanup.policy=delete compression.type=producer delete.retention.ms=86400000 file.delete.delay.ms=60000 flush.messages=9223372036854775807 flush.ms=9223372036854775807 follower.replication.throttled.replicas= index.interval.bytes=4096 leader.replication.throttled.replicas= local.retention.bytes=-2 local.retention.ms=-2 max.compaction.lag.ms=9223372036854775807 max.message.bytes=104857600 message.downconversion.enable=true message.format.version=3.0-IV1 message.timestamp.after.max.ms=9223372036854775807 message.timestamp.before.max.ms=9223372036854775807 message.timestamp.difference.max.ms=9223372036854775807 message.timestamp.type=CreateTime min.cleanable.dirty.ratio=0.5 min.compaction.lag.ms=0 min.insync.replicas=1 preallocate=false remote.storage.enable=false retention.bytes=-1 retention.ms=43200000 segment.bytes=1073741824 segment.index.bytes=10485760 segment.jitter.ms=0 segment.ms=43200000 unclean.leader.election.enable=false Part of log related to one of affected partitions lifecycle: {"log":"[2025-02-17 09:35:27,681] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Rolled new log segment at offset 3053780 in 0 ms. (kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-17T09:35:27.681757986Z"} {"log":"[2025-02-17 09:35:27,681] INFO [ProducerStateManager partition=topic.name-24] Wrote producer snapshot at offset 3053780 with 9 producer ids in 0 ms. (org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-17T09:35:27.681904312Z"} {"log":"[2025-02-17 21:00:40,200] INFO [UnifiedLog partition=topic.name-24, dir=/bitnami/kafka/data] Incremented log start offset to 3053267 due to segment deletion (kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:00:40.200910181Z"} {"log":"[2025-02-17 21:00:40,201] INFO [UnifiedLog partition=topic.name-24, dir=/bitnami/kafka/data] Deleting segment LogSegment(baseOffset=3048556, size=92636371, lastModifiedTime=1739782794987, largestRecordTimestamp=1739782794807) due to log retention time 43200000ms breach based on the largest record timestamp in the segment (kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:00:40.201699088Z"} {"log":"[2025-02-17 21:01:40,201] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Deleting segment files LogSegment(baseOffset=3048556, size=92636371, lastModifiedTime=1739782794987, largestRecordTimestamp=1739782794807) (kafka.log.LocalLog$)\n","stream":"stdout","time":"2025-02-17T21:01:40.202002422Z"} {"log":"[2025-02-17 21:01:40,201] INFO Deleted producer state snapshot /bitnami/kafka/data/topic.name-24/00000000000003048556.snapshot.deleted (org.apache.kafka.storage.internals.log.SnapshotFile)\n","stream":"stdout","time":"2025-02-17T21:01:40.202028871Z"} {"log":"[2025-02-17 21:01:40,225] INFO Deleted log /bitnami/kafka/data/topic.name-24/00000000000003048556.log.deleted. (org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.22611061Z"} {"log":"[2025-02-17 21:01:40,226] INFO Deleted offset index /bitnami/kafka/data/topic.name-24/00000000000003048556.index.deleted. (org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.226186892Z"} {"log":"[2025-02-17 21:01:40,226] INFO Deleted time index /bitnami/kafka/data/topic.name-24/00000000000003048556.timeindex.deleted. (org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.226203127Z"} {"log":"[2025-02-17 21:35:40,198] INFO [UnifiedLog partition=topic.name-24, dir=/bitnami/kafka/data] Incremented log start offset to 3053780 due to segment deletion (kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:35:40.198570382Z"} {"log":"[2025-02-17 21:35:40,199] INFO [UnifiedLog partition=topic.name-24, dir=/bitnami/kafka/data] Deleting segment LogSegment(baseOffset=3053267, size=7863464, lastModifiedTime=1739784914960, largestRecordTimestamp=1739784914996) due to log retention time 43200000ms breach based on the largest record timestamp in the segment (kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:35:40.200056724Z"} {"log":"[2025-02-17 21:36:40,200] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Deleting segment files LogSegment(baseOffset=3053267, size=7863464, lastModifiedTime=1739784914960, largestRecordTimestamp=1739784914996) (kafka.log.LocalLog$)\n","stream":"stdout","time":"2025-02-17T21:36:40.20029636Z"} {"log":"[2025-02-17 21:36:40,202] INFO Deleted producer state snapshot /bitnami/kafka/data/topic.name-24/00000000000003053267.snapshot.deleted (org.apache.kafka.storage.internals.log.SnapshotFile)\n","stream":"stdout","time":"2025-02-17T21:36:40.202355814Z"} {"log":"[2025-02-17 21:36:40,205] INFO Deleted log /bitnami/kafka/data/topic.name-24/00000000000003053267.log.deleted. (org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.205770247Z"} {"log":"[2025-02-17 21:36:40,205] INFO Deleted offset index /bitnami/kafka/data/topic.name-24/00000000000003053267.index.deleted. (org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.206006967Z"} {"log":"[2025-02-17 21:36:40,206] INFO Deleted time index /bitnami/kafka/data/topic.name-24/00000000000003053267.timeindex.deleted. (org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.206457446Z"} {"log":"[2025-02-21 17:12:46,160] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Rolled new log segment at offset 3114191 in 2 ms. (kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-21T17:12:46.161085611Z"} {"log":"[2025-02-21 17:12:46,161] INFO [ProducerStateManager partition=topic.name-24] Wrote producer snapshot at offset 3114191 with 7 producer ids in 1 ms. (org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-21T17:12:46.161400854Z"} {"log":"[2025-02-22 05:12:49,276] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Rolled new log segment at offset 3119890 in 0 ms. (kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-22T05:12:49.27700611Z"} {"log":"[2025-02-22 05:12:49,276] INFO [ProducerStateManager partition=topic.name-24] Wrote producer snapshot at offset 3119890 with 7 producer ids in 1 ms. (org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-22T05:12:49.277087412Z"} {"log":"[2025-02-22 17:12:27,376] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Rolled new log segment at offset 3130184 in 1 ms. (kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-22T17:12:27.377141504Z"} {"log":"[2025-02-22 17:12:27,377] INFO [ProducerStateManager partition=topic.name-24] Wrote producer snapshot at offset 3130184 with 7 producer ids in 0 ms. (org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-22T17:12:27.37716661Z"} {"log":"[2025-02-23 05:13:23,236] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Rolled new log segment at offset 3136583 in 1 ms. (kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-23T05:13:23.236501907Z"} {"log":"[2025-02-23 05:13:23,236] INFO [ProducerStateManager partition=topic.name-24] Wrote producer snapshot at offset 3136583 with 7 producer ids in 0 ms. (org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-23T05:13:23.236553325Z"} {"log":"[2025-02-23 17:14:02,679] INFO [LocalLog partition=topic.name-24, dir=/bitnami/kafka/data] Rolled new log segment at offset 3146102 in 1 ms. (kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-23T17:14:02.679379715Z"} {"log":"[2025-02-23 17:14:02,679] INFO [ProducerStateManager partition=topic.name-24] Wrote producer snapshot at offset 3146102 with 7 producer ids in 0 ms. (org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-23T17:14:02.67942178Z"} LogCleaner JMX metrics: kafka_log_logcleanermanager_max_dirty_percent 0.0 kafka_log_logcleanermanager_uncleanable_bytes{logdirectory="\"/bitnami/kafka/data\"",} 0.0 kafka_log_logcleaner_max_clean_time_secs 0.0 kafka_log_logcleaner_max_buffer_utilization_percent 0.0 kafka_log_logcleanermanager_uncleanable_partitions_count{logdirectory="\"/bitnami/kafka/data\"",} 0.0 kafka_log_logcleaner_deadthreadcount 0.0 Best regards, Andrei Turygin e-mail: turygin...@digitech.ru<mailto:turygin...@digitech.ru>