I think you are right. Technically, it's a "minimum" not a "maximum".
The cleanup happens asynchronously in the background log-cleaner thread. Old segments that lie beyond the "retention.bytes" config can be removed.
I think it's just a difference between "technically correct" (i.e., engineering / nerd language) and "regular English", i.e., how normal people speak.
In regular English one would say, "I limit the size to 1GB", even if 1GB is not a strict limit (never larger than 1GB), but technically a lower bound.
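
To make the lower-bound point concrete, here is a rough Python sketch using the 1 GB / 512 MB numbers from the question below. It is a simplified model under stated assumptions, not Kafka's actual code: the function name, the 64 MB write size, and running the cleanup check after every write are all made up for illustration (the real check runs periodically in the background). The one rule it models is that the oldest segment is removed only if the remaining log would still hold at least retention.bytes.

```python
MB = 1024 * 1024


def simulate_retention(retention_bytes, segment_bytes, write_bytes, num_writes):
    """Simplified model of size-based retention for a single partition.

    Returns (peak, floor):
      peak  - largest partition size observed
      floor - smallest size left after a cleanup pass, counted from the
              first moment the partition reached retention_bytes
    """
    segments = [0]  # segment sizes in bytes, oldest first; the last one is active
    peak = 0
    floor = None
    for _ in range(num_writes):
        # Assumption: roll a new active segment when the write would overflow it.
        if segments[-1] + write_bytes > segment_bytes:
            segments.append(0)
        segments[-1] += write_bytes

        total = sum(segments)
        peak = max(peak, total)

        # Drop the oldest segment only while the bytes above retention_bytes
        # cover that whole segment; the active segment is never dropped here.
        excess = total - retention_bytes
        while len(segments) > 1 and excess >= segments[0]:
            excess -= segments.pop(0)

        if total >= retention_bytes:
            size = sum(segments)
            floor = size if floor is None else min(floor, size)
    return peak, floor


peak, floor = simulate_retention(1024 * MB, 512 * MB, 64 * MB, 100)
print(peak // MB, floor // MB)  # prints "1536 1024"
```

In this model the partition grows to about 1.5 GB before the oldest 512 MB segment can go, and it never shrinks below 1 GB afterwards, which is why 1 GB behaves as a lower bound rather than a strict cap.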
> I would appreciate if you could fix and clarify that in the documentation.
Feel free to open a PR for it :)

-Matthias

On 2/23/25 10:59 AM, אורי אהרוני wrote:
Hi,

I ran into something I don't understand, and I would like you to explain it to me or, if possible, change the documentation.

The Kafka docs describe the 'retention.bytes' configuration as:

    This configuration controls the maximum size a partition (which consists of log segments) can grow to before we will discard old log segments to free up space if we are using the "delete" retention policy

Unfortunately, I didn't fully understand the meaning of this field. I interpret the description as: once a log segment reaches 'retention.bytes', old segments will be deleted. But as I understand it, that is not the case, because, like retention.hours, I believe it is a guarantee of the (minimum) number of bytes that will be kept for a partition.

Here is an example of the difference, taken from IBM:

    A topic with retention.bytes of 1 GB, and with a log segment size of 512 MB: With one partition, it would reserve about 1.5 GB of storage. In this case, the reserved size is significantly larger than the retention size.

In this example, there is a guarantee that the topic size won't be LESS THAN 1 GB. But from the docs I would expect that once the topic reaches 1 GB (or a bit more), old segments will be deleted: a segment would be deleted automatically when the size reaches 1 GB, so the partition would stay at approximately 1 GB and not 1.5 GB as stated.

My question is whether I understood the definition of the field correctly. If not, I would be happy if you could explain what I missed. If I'm correct that the definition is not well explained, I would appreciate if you could fix and clarify that in the documentation.

Thanks,
Ori.