Is the log cleaner thread running? We've seen the log cleaner thread die
when a single segment holds more messages than the cleaner's offset map can
fit. You'll see a message like this:

[kafka-log-cleaner-thread-0], Error due to
java.lang.IllegalArgumentException: requirement failed: 9750860 messages in
segment MY_FAVORITE_TOPIC_IS_SORBET-2/00000000000047580165.log but offset
map can fit only 5033164. You can increase log.cleaner.dedupe.buffer.size
or decrease log.cleaner.threads
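
To make the numbers in that message concrete, here is a rough sketch of the
arithmetic. It assumes 24 bytes per offset map entry (a 16-byte MD5 hash plus
an 8-byte offset), a 0.9 load factor, and the buffer being split evenly across
cleaner threads. None of that is stated in the message itself, but with a
128 MiB buffer and one thread it reproduces the 5033164 figure above. The
function names are made up for illustration.

```python
import math

# Assumed constants (not taken from the error message itself):
BYTES_PER_ENTRY = 24   # 16-byte MD5 hash + 8-byte offset per map entry
LOAD_FACTOR = 0.9      # fraction of map slots the cleaner is willing to fill


def offset_map_capacity(dedupe_buffer_bytes, cleaner_threads=1):
    """Messages one cleaner thread's offset map can hold."""
    per_thread_bytes = dedupe_buffer_bytes // cleaner_threads
    slots = per_thread_bytes // BYTES_PER_ENTRY
    return int(slots * LOAD_FACTOR)


def min_dedupe_buffer_bytes(messages_in_segment, cleaner_threads=1):
    """Smallest log.cleaner.dedupe.buffer.size that fits the segment."""
    slots = math.ceil(messages_in_segment / LOAD_FACTOR)
    # Guard against floating-point rounding at the boundary.
    while int(slots * LOAD_FACTOR) < messages_in_segment:
        slots += 1
    return slots * BYTES_PER_ENTRY * cleaner_threads


if __name__ == "__main__":
    print(offset_map_capacity(128 * 1024 * 1024))   # 128 MiB, one thread
    print(min_dedupe_buffer_bytes(9750860))         # segment from the log
```

Under those assumptions, the 9750860-message segment needs a buffer of about
248 MiB per cleaner thread before the cleaner can handle it.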

You can check if it's running by dumping threads using JMX and looking for
a thread whose name contains `kafka-log-cleaner-thread`.
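
If you'd rather script the check, something like the sketch below works on a
thread dump saved as text. It assumes the dump uses the jstack-style layout,
where each thread's header line starts with the thread name in double quotes.
Output from a JMX client may be laid out differently, and the function name is
just for illustration.

```python
import re

CLEANER_THREAD_PREFIX = "kafka-log-cleaner-thread"


def find_cleaner_threads(thread_dump_text):
    """Return the names of log cleaner threads present in a thread dump.

    Assumes jstack-style header lines such as:
        "some-thread-name" #12 daemon prio=5 ...
    """
    names = re.findall(r'^"([^"]+)"', thread_dump_text, flags=re.MULTILINE)
    return [name for name in names if CLEANER_THREAD_PREFIX in name]
```

An empty list means no cleaner thread showed up in the dump, which is what
you'd expect to see after the error above has killed it.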

If this happens, there's not much remediation you *can* do right now.
One option (assuming sufficient replication and an otherwise stable
cluster) is to delete the data on the broker, bring it up again, and
ensure the log cleaner is turned on the whole time. *Hopefully* compaction
will keep up while Kafka catches up with replication, but that's not
guaranteed.

We're going to be filing a ticket upstream shortly based on this and other
issues we've seen with log compaction.

On Wed, Jun 22, 2016 at 6:03 PM, Lawrence Weikum <lwei...@pandora.com>
wrote:

> We seem to be having a strange issue with a cluster of ours; specifically
> with the __consumer_offsets topic.
>
> When we brought the cluster online, log compaction was turned off.
> Realizing our mistake, we turned it on, but only after the topic had over
> 31,018,699,972 offsets committed to it.  Log compaction seems to have
> worked and be working properly.  The logs are showing that every partition
> has been compacted, and many pieces have been marked for deletion.
>
> The problem is that not all partitions are having their older logs
> deleted.  Some partitions will grow to having 19 log files, but a few
> seconds later will have only 13.  One partition in particular, though,
> still has all of its log files, all 19,000 of them, and this never seems to
> change, only grow as new offsets come in.
>
> Restarting that broker doesn’t seem to help.
>
>
> We’ve checked the broker settings on everything as well.
>
> log.cleaner.enable = true
> log.cleanup.policy = delete
> cleanup.policy = compact
>
>
> Has anyone encountered this issue before?
>
> Thank you all for the help!
>
> Lawrence Weikum
>
>
