Hello,
We have a cluster of 3 Kafka brokers working in KRaft mode as a docker 
containers, configured to clean up old messages with reteintion.ms parameter.
After some period of time Kafka suddenly stopped to clean up old messages in 
several partitions in one topic, while other partitions of same topic are 
cleaned up normally according to retention.ms.
After manual cleanup of affected partitions problem is gone and log segments 
are rolled and deleted according to retention.ms and segment.ms again without 
any issues.

Affected partitions are assigned to brokers correctly.
In broker logs there are no errors or warnings found.
>From logs attached below it can be seen that at some moment log segments just 
>stopped to be deleted, but new segments are rolled normally according to 
>segment.ms.

Topic configurations:
  cleanup.policy=delete
  compression.type=producer
  delete.retention.ms=86400000
  file.delete.delay.ms=60000
  flush.messages=9223372036854775807
  flush.ms=9223372036854775807
  follower.replication.throttled.replicas=
  index.interval.bytes=4096
  leader.replication.throttled.replicas=
  local.retention.bytes=-2
  local.retention.ms=-2
  max.compaction.lag.ms=9223372036854775807
  max.message.bytes=104857600
  message.downconversion.enable=true
  message.format.version=3.0-IV1
  message.timestamp.after.max.ms=9223372036854775807
  message.timestamp.before.max.ms=9223372036854775807
  message.timestamp.difference.max.ms=9223372036854775807
  message.timestamp.type=CreateTime
  min.cleanable.dirty.ratio=0.5
  min.compaction.lag.ms=0
  min.insync.replicas=1
  preallocate=false
  remote.storage.enable=false
  retention.bytes=-1
  retention.ms=43200000
  segment.bytes=1073741824
  segment.index.bytes=10485760
  segment.jitter.ms=0
  segment.ms=43200000
  unclean.leader.election.enable=false

Part of log related to one of affected partitions lifecycle:

{"log":"[2025-02-17 09:35:27,681] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Rolled new log segment at offset 3053780 in 0 ms. 
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-17T09:35:27.681757986Z"}
{"log":"[2025-02-17 09:35:27,681] INFO [ProducerStateManager 
partition=topic.name-24] Wrote producer snapshot at offset 3053780 with 9 
producer ids in 0 ms. 
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-17T09:35:27.681904312Z"}
{"log":"[2025-02-17 21:00:40,200] INFO [UnifiedLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Incremented log start offset to 3053267 due to segment 
deletion 
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:00:40.200910181Z"}
{"log":"[2025-02-17 21:00:40,201] INFO [UnifiedLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Deleting segment LogSegment(baseOffset=3048556, 
size=92636371, lastModifiedTime=1739782794987, 
largestRecordTimestamp=1739782794807) due to log retention time 43200000ms 
breach based on the largest record timestamp in the segment 
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:00:40.201699088Z"}
{"log":"[2025-02-17 21:01:40,201] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Deleting segment files LogSegment(baseOffset=3048556, 
size=92636371, lastModifiedTime=1739782794987, 
largestRecordTimestamp=1739782794807) 
(kafka.log.LocalLog$)\n","stream":"stdout","time":"2025-02-17T21:01:40.202002422Z"}
{"log":"[2025-02-17 21:01:40,201] INFO Deleted producer state snapshot 
/bitnami/kafka/data/topic.name-24/00000000000003048556.snapshot.deleted 
(org.apache.kafka.storage.internals.log.SnapshotFile)\n","stream":"stdout","time":"2025-02-17T21:01:40.202028871Z"}
{"log":"[2025-02-17 21:01:40,225] INFO Deleted log 
/bitnami/kafka/data/topic.name-24/00000000000003048556.log.deleted. 
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.22611061Z"}
{"log":"[2025-02-17 21:01:40,226] INFO Deleted offset index 
/bitnami/kafka/data/topic.name-24/00000000000003048556.index.deleted. 
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.226186892Z"}
{"log":"[2025-02-17 21:01:40,226] INFO Deleted time index 
/bitnami/kafka/data/topic.name-24/00000000000003048556.timeindex.deleted. 
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:01:40.226203127Z"}
{"log":"[2025-02-17 21:35:40,198] INFO [UnifiedLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Incremented log start offset to 3053780 due to segment 
deletion 
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:35:40.198570382Z"}
{"log":"[2025-02-17 21:35:40,199] INFO [UnifiedLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Deleting segment LogSegment(baseOffset=3053267, 
size=7863464, lastModifiedTime=1739784914960, 
largestRecordTimestamp=1739784914996) due to log retention time 43200000ms 
breach based on the largest record timestamp in the segment 
(kafka.log.UnifiedLog)\n","stream":"stdout","time":"2025-02-17T21:35:40.200056724Z"}
{"log":"[2025-02-17 21:36:40,200] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Deleting segment files LogSegment(baseOffset=3053267, 
size=7863464, lastModifiedTime=1739784914960, 
largestRecordTimestamp=1739784914996) 
(kafka.log.LocalLog$)\n","stream":"stdout","time":"2025-02-17T21:36:40.20029636Z"}
{"log":"[2025-02-17 21:36:40,202] INFO Deleted producer state snapshot 
/bitnami/kafka/data/topic.name-24/00000000000003053267.snapshot.deleted 
(org.apache.kafka.storage.internals.log.SnapshotFile)\n","stream":"stdout","time":"2025-02-17T21:36:40.202355814Z"}
{"log":"[2025-02-17 21:36:40,205] INFO Deleted log 
/bitnami/kafka/data/topic.name-24/00000000000003053267.log.deleted. 
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.205770247Z"}
{"log":"[2025-02-17 21:36:40,205] INFO Deleted offset index 
/bitnami/kafka/data/topic.name-24/00000000000003053267.index.deleted. 
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.206006967Z"}
{"log":"[2025-02-17 21:36:40,206] INFO Deleted time index 
/bitnami/kafka/data/topic.name-24/00000000000003053267.timeindex.deleted. 
(org.apache.kafka.storage.internals.log.LogSegment)\n","stream":"stdout","time":"2025-02-17T21:36:40.206457446Z"}
{"log":"[2025-02-21 17:12:46,160] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Rolled new log segment at offset 3114191 in 2 ms. 
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-21T17:12:46.161085611Z"}
{"log":"[2025-02-21 17:12:46,161] INFO [ProducerStateManager 
partition=topic.name-24] Wrote producer snapshot at offset 3114191 with 7 
producer ids in 1 ms. 
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-21T17:12:46.161400854Z"}
{"log":"[2025-02-22 05:12:49,276] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Rolled new log segment at offset 3119890 in 0 ms. 
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-22T05:12:49.27700611Z"}
{"log":"[2025-02-22 05:12:49,276] INFO [ProducerStateManager 
partition=topic.name-24] Wrote producer snapshot at offset 3119890 with 7 
producer ids in 1 ms. 
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-22T05:12:49.277087412Z"}
{"log":"[2025-02-22 17:12:27,376] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Rolled new log segment at offset 3130184 in 1 ms. 
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-22T17:12:27.377141504Z"}
{"log":"[2025-02-22 17:12:27,377] INFO [ProducerStateManager 
partition=topic.name-24] Wrote producer snapshot at offset 3130184 with 7 
producer ids in 0 ms. 
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-22T17:12:27.37716661Z"}
{"log":"[2025-02-23 05:13:23,236] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Rolled new log segment at offset 3136583 in 1 ms. 
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-23T05:13:23.236501907Z"}
{"log":"[2025-02-23 05:13:23,236] INFO [ProducerStateManager 
partition=topic.name-24] Wrote producer snapshot at offset 3136583 with 7 
producer ids in 0 ms. 
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-23T05:13:23.236553325Z"}
{"log":"[2025-02-23 17:14:02,679] INFO [LocalLog partition=topic.name-24, 
dir=/bitnami/kafka/data] Rolled new log segment at offset 3146102 in 1 ms. 
(kafka.log.LocalLog)\n","stream":"stdout","time":"2025-02-23T17:14:02.679379715Z"}
{"log":"[2025-02-23 17:14:02,679] INFO [ProducerStateManager 
partition=topic.name-24] Wrote producer snapshot at offset 3146102 with 7 
producer ids in 0 ms. 
(org.apache.kafka.storage.internals.log.ProducerStateManager)\n","stream":"stdout","time":"2025-02-23T17:14:02.67942178Z"}

LogCleaner JMX metrics:

kafka_log_logcleanermanager_max_dirty_percent 0.0
kafka_log_logcleanermanager_uncleanable_bytes{logdirectory="\"/bitnami/kafka/data\"",}
 0.0
kafka_log_logcleaner_max_clean_time_secs 0.0
kafka_log_logcleaner_max_buffer_utilization_percent 0.0
kafka_log_logcleanermanager_uncleanable_partitions_count{logdirectory="\"/bitnami/kafka/data\"",}
 0.0
kafka_log_logcleaner_deadthreadcount 0.0

Best regards,
Andrei Turygin
e-mail: turygin...@digitech.ru<mailto:turygin...@digitech.ru>

  • Kafka:3.7.1 problem with lo... Турыгин Андрей Андреевич

Reply via email to