Hello Peng.

> In most use cases, Kafka serves as a messaging middleware where messages
that have already been consumed are typically no longer needed and can be
safely deleted.

I don't think we can make this kind of assumption about how Kafka is used. In
any case, defining which consumer groups must have consumed a message before
it can be deemed obsolete is often non-trivial. (For example, at my workplace we
have multiple instances of multiple apps consuming from the same topics.
Configuring the required consumer group patterns for a feature like this would
be impractical.)

For situations where Kafka downtime is a problem, I would expect monitoring on 
several important resources such as available memory, disk, CPU, general broker 
availability, and so on. There are several tools that already solve this and I 
don't think it's a good idea to try to solve it in Kafka as well.

Also, this feature can create a false sense of security. Other processes on
the same machine, for example logging frameworks, can also use up available
disk space. Other systems could therefore starve Kafka of resources over time,
so you would need to monitor Kafka disk usage anyway, which makes this feature
redundant.
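
To illustrate the kind of external check meant here, below is a minimal
sketch of a disk usage probe, not a recommendation of any particular tool.
The function names, the `log_dir` path and the 0.95 threshold are assumptions
made for the example. In practice this would be handled by whatever
monitoring stack is already in place.

```python
import shutil


def is_over_threshold(used_bytes: int, total_bytes: int, threshold: float = 0.95) -> bool:
    """Return True when the used fraction of the disk has reached the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= threshold


def disk_usage_ratio(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


if __name__ == "__main__":
    log_dir = "."  # assumption: replace with the volume that holds the Kafka logs
    usage = shutil.disk_usage(log_dir)
    if is_over_threshold(usage.used, usage.total):
        print(f"ALERT: {log_dir} is at {disk_usage_ratio(log_dir):.0%} usage")
    else:
        print(f"OK: {log_dir} is at {disk_usage_ratio(log_dir):.0%} usage")
```

Because the check looks at the whole volume, it also catches disk space used
by other processes, which a broker-side threshold on Kafka data alone would
not.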

Regards,

________________________________
From: peng <p1070048...@gmail.com>
Sent: Friday, August 1, 2025 12:33
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: Re: [DISCUSS] KIP-1201: Add disk threshold strategy to prevent disk 
full failure

In most use cases, Kafka serves as a messaging middleware where messages
that have already been consumed are typically no longer needed and can be
safely deleted. Therefore, I propose enhancing the threshold strategy with
an automatic deletion feature:

When a broker's disk usage reaches 95%, it should automatically delete the
oldest 10% of messages on the node to free up disk space, allowing new
messages to be produced. This eliminates the need for manual cleanup while
ensuring that new messages (which are almost always more critical than
already-consumed data) take priority.

Prevents disk-full scenarios by automatically removing stale data.
No admin intervention required for basic cleanup.
Fresh messages are never blocked by obsolete ones.

The only potential risk arises if consumer groups experience significant
lag, in which case unconsumed messages might be deleted prematurely. However,
in such cases, the root issue is the backlog itself: teams should prioritize
resolving the lag rather than relying on retention.


To accommodate different needs, we could introduce a
`disk.threshold.policy` parameter, allowing users to choose between:
1. Rejecting new messages
2. Auto deleting the oldest messages
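
As a rough illustration only, the two options could behave as in the sketch
below. This is not the KIP's design: the enum values, the function name and
the segment model are hypothetical. It assumes the 95% threshold and 10%
deletion target mentioned above, and it assumes data is removed in whole
segments, oldest first.

```python
from enum import Enum
from typing import List, Tuple


class DiskThresholdPolicy(Enum):
    # Hypothetical values for the proposed `disk.threshold.policy` parameter.
    REJECT = "reject"
    DELETE_OLDEST = "delete_oldest"


def apply_policy(
    policy: DiskThresholdPolicy,
    usage_ratio: float,
    segment_sizes: List[int],
    threshold: float = 0.95,
    delete_fraction: float = 0.10,
) -> Tuple[bool, List[int]]:
    """Decide whether to accept new messages and which segments remain.

    segment_sizes lists segment sizes in bytes, oldest first.
    Returns (accept_new_messages, remaining_segment_sizes).
    """
    remaining = list(segment_sizes)
    if usage_ratio < threshold:
        return True, remaining  # below the threshold: nothing to do
    if policy is DiskThresholdPolicy.REJECT:
        return False, remaining  # option 1: reject new messages, keep data
    # Option 2: drop the oldest segments until at least delete_fraction
    # of the stored bytes has been freed.
    target = delete_fraction * sum(remaining)
    freed = 0
    while remaining and freed < target:
        freed += remaining.pop(0)
    return True, remaining
```

For example, with segments of 20, 30 and 50 bytes at 96% usage,
`DELETE_OLDEST` drops the 20-byte segment and keeps accepting writes, while
`REJECT` keeps all three segments and refuses new messages.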


Best regards

On Thu, Jul 31, 2025 at 8:18 PM, mapan <mapan0...@gmail.com> wrote:

> Hi all,
>
> I’d like to start a discussion about a new KIP:
> https://cwiki.apache.org/confluence/x/Nw9JFg
>
> This KIP suggests adding disk threshold configs to Kafka and rejecting new
> produce requests after the threshold is reached, to prevent disk-full
> failures.
>
> This strategy is similar to RocketMQ's diskMaxUsedSpaceRatio config or
> RabbitMQ's disk_free_limit config, and I hope to implement this strategy in
> our environment.
>
> Please share your feedback, questions, or concerns so we can refine
> the proposal together.
>
> Best regards,
> mapan
>