Hello Peng.

> In most use cases, Kafka serves as a messaging middleware where messages that have already been consumed are typically no longer needed and can be safely deleted.
I don't think we can make this kind of assumption about how Kafka is used. In any case, defining which consumer groups must have consumed a message before it can be deemed obsolete is often non-trivial. (E.g. at my workplace we have multiple instances of multiple apps consuming from the same topics. Configuring the required consumer group patterns for a feature like this would be impractical.)

For situations where Kafka downtime is a problem, I would expect monitoring on several important resources such as available memory, disk, CPU, general broker availability, and so on. There are several tools that already solve this and I don't think it's a good idea to try to solve it in Kafka as well.

Also, this feature can create a false sense of security. Other processes on the same machine, for example logging frameworks, can use up available disk space. Other systems could therefore starve Kafka of resources over time, so you would need to monitor Kafka disk usage anyway, making this feature redundant.

Regards,

________________________________
From: peng <p1070048...@gmail.com>
Sent: Friday, August 1, 2025 12:33
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: Re: [DISCUSS] KIP-1201: Add disk threshold strategy to prevent disk full failure

In most use cases, Kafka serves as a messaging middleware where messages that have already been consumed are typically no longer needed and can be safely deleted.
Therefore, I propose enhancing the threshold strategy with an automatic deletion feature: when a broker's disk usage reaches 95%, it should automatically delete the oldest 10% of messages on the node to free up disk space, allowing new messages to be produced. This eliminates the need for manual cleanup while ensuring that new messages (which are almost always more critical than already-consumed data) take priority.

Prevents disk-full scenarios by automatically removing stale data.
No admin intervention required for basic cleanup.
Fresh messages are never blocked by obsolete ones.

The only potential risk arises if consumer groups experience significant lag, in which case unconsumed messages might be deleted prematurely. However, in such cases, the root issue is the backlog itself: teams should prioritize resolving the lag rather than relying on retention.

To accommodate different needs, we could introduce a `disk.threshold.policy` parameter, allowing users to choose between:

1. Rejecting new messages
2. Auto deleting the oldest messages

Best regards

mapan <mapan0...@gmail.com> wrote on Thu, Jul 31, 2025 at 8:18 PM:

> Hi all,
>
> I'd like to start a discussion about a new KIP:
> https://cwiki.apache.org/confluence/x/Nw9JFg
>
> This KIP suggests adding disk threshold configs in Kafka and rejecting new
> produce requests after reaching the threshold to prevent disk full failure.
>
> This strategy is similar to RocketMQ's diskMaxUsedSpaceRatio config or
> RabbitMQ's disk_free_limit config, and I hope to implement this strategy in our
> environment.
>
> Please share your feedback, questions, or concerns so we can refine
> the proposal together.
>
> Best regards,
> mapan
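
For readers following the thread, the two behaviours discussed above can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the KIP or from Kafka: the 95% threshold and the `disk.threshold.policy` name come from the thread, while the policy values, the function names, and the returned action strings are hypothetical. Note that the measured ratio covers the whole filesystem, so disk space used by other processes on the same machine counts too.

```python
import shutil
from enum import Enum


class DiskThresholdPolicy(Enum):
    # Hypothetical values for the proposed `disk.threshold.policy` setting.
    REJECT = "reject"
    DELETE_OLDEST = "delete_oldest"


def disk_usage_ratio(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def decide_action(
    used_ratio: float,
    policy: DiskThresholdPolicy,
    threshold: float = 0.95,  # the 95% figure mentioned in the thread
) -> str:
    """Pick a (hypothetical) broker action for the given disk usage."""
    if used_ratio < threshold:
        return "accept"
    if policy is DiskThresholdPolicy.REJECT:
        return "reject_produce"
    return "delete_oldest"


if __name__ == "__main__":
    ratio = disk_usage_ratio(".")
    print(f"used: {ratio:.1%} ->", decide_action(ratio, DiskThresholdPolicy.REJECT))
```

The same `disk_usage_ratio` check is also all that an external monitor would need in order to alert on disk usage, independently of any broker-side policy.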