In most use cases, Kafka serves as a messaging middleware where messages
that have already been consumed are typically no longer needed and can be
safely deleted. Therefore, I propose enhancing the threshold strategy with
an automatic deletion feature:

When a broker's disk usage reaches 95%, it should automatically delete the
oldest 10% of messages on the node to free up disk space, allowing new
messages to be produced. This eliminates the need for manual cleanup while
ensuring that new messages (which are almost always more critical than
already-consumed data) take priority.
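
To make the 95% / 10% rule concrete, here is a rough sketch of how a broker
could pick what to delete. It assumes deletion happens per log segment rather
than per message, since Kafka's retention removes whole segments. The
`Segment` type, its fields, and `select_segments_to_delete` are hypothetical
names for illustration only, not existing Kafka APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """Hypothetical view of one log segment on the broker's disk."""
    partition: str      # e.g. "orders-0"
    newest_ts_ms: int   # timestamp of the newest record in the segment
    size_bytes: int


def select_segments_to_delete(
    segments: List[Segment],
    disk_total_bytes: int,
    disk_used_bytes: int,
    threshold_pct: int = 95,   # trigger level from the proposal
    reclaim_pct: int = 10,     # share of log data to free, from the proposal
) -> List[Segment]:
    """Return the oldest segments whose combined size covers reclaim_pct
    of all log bytes, or an empty list if usage is below the threshold."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    if disk_used_bytes * 100 < threshold_pct * disk_total_bytes:
        return []

    total_log_bytes = sum(s.size_bytes for s in segments)
    chosen: List[Segment] = []
    freed = 0
    # Oldest first, across all partitions on this node.
    for seg in sorted(segments, key=lambda s: s.newest_ts_ms):
        if freed * 100 >= total_log_bytes * reclaim_pct:
            break
        chosen.append(seg)
        freed += seg.size_bytes
    return chosen
```

With ten equally sized segments and the disk at 96%, this selects only the
single oldest segment; below 95% it selects nothing. A real implementation
would also have to decide how to treat a partition's active segment and
compacted topics, which this sketch ignores.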

- Prevents disk-full scenarios by automatically removing stale data.
- No admin intervention required for basic cleanup.
- Fresh messages are never blocked by obsolete ones.

The only potential risk arises when consumer groups lag significantly, in
which case unconsumed messages might be deleted prematurely. However, in
such cases the root issue is the backlog itself; teams should prioritize
resolving the lag rather than relying on retention.


To accommodate different needs, we could introduce a
`disk.threshold.policy` parameter, allowing users to choose between:
1. Rejecting new messages
2. Auto deleting the oldest messages
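
As a rough illustration of how the two options might be wired together, the
sketch below dispatches on the policy once the threshold is reached. Only the
`disk.threshold.policy` name comes from this proposal; the values `reject`
and `delete_oldest`, the enum, and the `on_produce` function are assumptions
made up for this example.

```python
from enum import Enum


class DiskThresholdPolicy(Enum):
    """Hypothetical values for the proposed disk.threshold.policy config."""
    REJECT = "reject"                # option 1: reject new messages
    DELETE_OLDEST = "delete_oldest"  # option 2: auto delete oldest messages


def on_produce(policy: DiskThresholdPolicy,
               disk_used_pct: int,
               threshold_pct: int = 95) -> str:
    """Decide what the broker does with an incoming produce request."""
    if disk_used_pct < threshold_pct:
        return "accept"
    if policy is DiskThresholdPolicy.REJECT:
        return "reject"
    # DELETE_OLDEST: free space first, then let the request through.
    return "delete_oldest_then_accept"


# Parsing the configured string value into the enum:
policy = DiskThresholdPolicy("delete_oldest")
```

Defaulting to the reject behaviour would keep the original KIP semantics
unchanged for users who do not opt in to automatic deletion.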


Best regards

mapan <mapan0...@gmail.com> wrote on Thu, Jul 31, 2025 at 8:18 PM:

> Hi all,
>
> I’d like to start a discussion about a new KIP:
> https://cwiki.apache.org/confluence/x/Nw9JFg
>
> This KIP suggests adding disk threshold configs to Kafka and rejecting new
> produce requests after reaching the threshold to prevent disk-full failures.
>
> This strategy is similar to RocketMQ's diskMaxUsedSpaceRatio config or
> RabbitMQ's disk_free_limit config, and I hope to implement this strategy in
> our environment.
>
> Please share your feedback, questions, or concerns so we can refine
> the proposal together.
>
> Best regards,
> mapan
>
