We seem to be having a strange issue with one of our clusters, specifically with the __consumer_offsets topic.
When we brought the cluster online, log compaction was turned off. Realizing our mistake, we turned it on, but only after the topic had over 31,018,699,972 offsets committed to it. Log compaction seems to have worked and to be working properly. The logs show that every partition has been compacted and that many log segments have been marked for deletion.

The problem is that not all partitions are having their older logs deleted. Some partitions will grow to 19 log files and then, a few seconds later, have only 13. One partition in particular, though, still has all of its log files, all 19,000 of them, and this never seems to change; it only grows as new offsets come in. Restarting that broker doesn’t seem to help.

We’ve checked the broker settings on everything as well:

log.cleaner.enable = true
log.cleanup.policy = delete
cleanup.policy = compact

Has anyone encountered this issue before?

Thank you all for the help!

Lawrence Weikum
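P.S. In case it helps anyone compare against their own brokers, here is a minimal sketch of how the per-partition log file counts mentioned above could be gathered. It assumes the broker's data directory uses the usual <topic>-<partition> subdirectory layout with segment files ending in .log. The function name count_segments and the path in the usage note are placeholders for illustration only.

```python
from pathlib import Path

TOPIC = "__consumer_offsets"


def count_segments(log_dir, topic=TOPIC):
    """Return {partition number: count of .log segment files} for one topic.

    log_dir is assumed to be a broker data directory containing one
    subdirectory per partition, named <topic>-<partition>.
    """
    counts = {}
    for part_dir in Path(log_dir).glob(f"{topic}-*"):
        suffix = part_dir.name[len(topic) + 1:]
        # Skip anything that is not a plain <topic>-<partition> directory.
        if not part_dir.is_dir() or not suffix.isdigit():
            continue
        counts[int(suffix)] = sum(1 for _ in part_dir.glob("*.log"))
    return counts
```

Running something like count_segments("/var/kafka-logs") on each broker (the path is hypothetical) and sorting the result by count should make a partition that never shrinks stand out.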