These are generated HTML files. As with all documentation, the source of truth lies in `apache/kafka`. In this case you'd need to look, for example, here https://github.com/apache/kafka/blob/a39fcac95c82133ac6d9116216ae819d0bf9a6bd/storage/src/main/java/org/apache/kafka/server/log/remote/storage/RemoteLogManagerConfig.java#L148 and here https://github.com/apache/kafka/blob/a39fcac95c82133ac6d9116216ae819d0bf9a6bd/clients/src/main/java/org/apache/kafka/common/config/TopicConfig.java#L92.
Best,

On Thu, Feb 27, 2025 at 1:12 PM אורי אהרוני <282...@gmail.com> wrote:

> Well, I've seen that the configuration descriptions for topics are only in
> 'kafka-site', in the file 'topic_config.html', and not in the kafka
> repository.
> Does it mean I can open a PR only in 'kafka-site' with the change?
> Thanks
>
> On Thu, Feb 27, 2025 at 13:58, Josep Prat <josep.p...@aiven.io.invalid> wrote:
>
> > Hi there,
> > Documentation is in both repositories (https://github.com/apache/kafka-site
> > and https://github.com/apache/kafka). To submit a PR, you need to fork the
> > repo, make the changes and submit the PR. You can start by submitting a PR
> > changing the necessary files under
> > https://github.com/apache/kafka/tree/trunk/docs and then we can open a new
> > one for the `kafka-site` repo.
> > The Kafka repo is the source of truth, while the Kafka-site one is the
> > repository that serves the website and it's properly versioned in place.
> >
> > Best,
> >
> > On Thu, Feb 27, 2025 at 12:53 PM אורי אהרוני <282...@gmail.com> wrote:
> >
> > > I would like to propose the docs change for retention.bytes.
> > > I see it's in this repo: https://github.com/apache/kafka-site
> > > How could I get permission for opening a PR or a new issue?
> > >
> > > On Tue, Feb 25, 2025 at 11:25, Brebner, Paul
> > > <paul.breb...@netapp.com.invalid> wrote:
> > >
> > > > Well spotted I think - I was briefly puzzled by the time retention
> > > > behaviour, as segments seemed to live longer than advertised - until I
> > > > realised it was a min time; deletion is lazy and can occur at some
> > > > (distant?) time in the future (and is async I think). This was
> > > > particularly noticeable for tiered storage (the only time I've really
> > > > understood how Kafka segments work and looked closely). Paul
> > > >
> > > > From: Matthias J. Sax <mj...@apache.org>
> > > > Date: Tuesday, 25 February 2025 at 1:16 pm
> > > > To: users@kafka.apache.org <users@kafka.apache.org>
> > > > Subject: Re: Documentation and meaning of configuration 'retention.bytes'
> > > >
> > > > I think you are right. Technically, it's a "minimum" not a "maximum".
> > > >
> > > > The cleanup happens async by the background log-cleaner thread.
> > > > Segments which go beyond the "retention.bytes" config can be removed.
> > > >
> > > > I think it's just a difference between "technically correct" (ie,
> > > > engineering / nerd language) and "regular English", ie, how normal
> > > > people speak.
> > > >
> > > > In regular English one would say, "I limit the size to 1GB", even if
> > > > 1GB is not a strict limit (never larger than 1GB), but technically a
> > > > lower bound.
> > > >
> > > > > I would appreciate if you could fix and clarify that in the
> > > > > documentation.
> > > >
> > > > Feel free to open a PR for it :)
> > > >
> > > > -Matthias
> > > >
> > > > On 2/23/25 10:59 AM, אורי אהרוני wrote:
> > > > > Hi,
> > > > > I encountered a misunderstanding and I would like you to explain it
> > > > > to me or, if possible, change the documentation.
> > > > >
> > > > > The Kafka docs describe the 'retention.bytes' configuration as:
> > > > > This configuration controls the maximum size a partition (which
> > > > > consists of log segments) can grow to before we will discard old log
> > > > > segments to free up space if we are using the "delete" retention
> > > > > policy
> > > > >
> > > > > Unfortunately I didn't fully understand the meaning of this field.
> > > > > I interpret that as: once a log segment reaches the 'retention.bytes'
> > > > > value, old segments will be deleted.
> > > > > But to my understanding that is not the case because, like
> > > > > retention.hours, I believe it is a guarantee of the (minimum) number
> > > > > of bytes that will be left for a partition.
> > > > >
> > > > > I will give an example of the difference.
> > > > > An example from IBM:
> > > > > A topic with retention.bytes of 1 GB, and with a log segment size of
> > > > > 512 MB:
> > > > >
> > > > > With one partition, it would reserve about 1.5 GB of storage.
> > > > > In this case, the reserved size is significantly larger than the
> > > > > retention size.
> > > > >
> > > > > In this example, there's a guarantee that our topic size won't be
> > > > > LESS THAN 1 GB.
> > > > > But from the docs I expect that once the topic reaches 1 GB (or a bit
> > > > > more), old segments will be deleted.
> > > > > In this example I would expect that when it reaches 1 GB, a segment
> > > > > will be automatically deleted and so the partition will be
> > > > > approximately 1 GB and not 1.5 GB as said.
> > > > >
> > > > > My question is whether I understood the definition of the field
> > > > > correctly.
> > > > > If not - I would be happy if you could explain what I missed.
> > > > > If I'm correct that the definition is not well explained, I would
> > > > > appreciate if you could fix and clarify that in the documentation.
> > > > > Thanks,
> > > > > Ori.
> > >
> > > --
> > > *Ori Aharoni*
> >
> > --
> > [image: Aiven] <https://www.aiven.io>
> >
> > *Josep Prat*
> > Engineering Director, Streaming Services, *Aiven*
> > josep.p...@aiven.io | +491715557497
> > aiven.io <https://www.aiven.io> | <https://www.facebook.com/aivencloud>
> > <https://www.linkedin.com/company/aiven/> <https://twitter.com/aiven_io>
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > Anna Richardson, Kenneth Chen
> > Amtsgericht Charlottenburg, HRB 209739 B
>
> --
> *Ori Aharoni*

--
*Josep Prat*
Engineering Director, Streaming Services, *Aiven*
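
The size-based retention behaviour discussed in this thread can be sketched as follows. This is a simplified model for illustration, not Kafka's actual code: the function name `segments_to_delete` and its arguments are hypothetical, and it assumes that an old segment is removed only when the partition would still hold at least `retention.bytes` afterwards, which is the "minimum, not maximum" reading given in the replies above.

```python
MB = 1024 * 1024
GB = 1024 * MB


def segments_to_delete(segment_sizes, retention_bytes):
    """Return how many of the oldest segments may be deleted.

    segment_sizes: sizes in bytes, oldest segment first (hypothetical input).
    retention_bytes: value of 'retention.bytes'; a negative value means no
    size limit, so nothing is deleted.
    """
    if retention_bytes < 0:
        return 0
    # Bytes currently held beyond the configured retention size.
    excess = sum(segment_sizes) - retention_bytes
    deleted = 0
    for size in segment_sizes:
        # Assumption: delete a segment only if the remaining log
        # would still be at least retention_bytes in size.
        if excess - size < 0:
            break
        excess -= size
        deleted += 1
    return deleted


# retention.bytes = 1 GB, segment size = 512 MB (the example from the thread)
print(segments_to_delete([512 * MB, 512 * MB, 256 * MB], GB))  # 0
print(segments_to_delete([512 * MB, 512 * MB, 512 * MB], GB))  # 1
```

Under this model the partition holds 1.25 GB in the first call and nothing is deleted; only once it reaches 1.5 GB can the oldest 512 MB segment go, which leaves 1 GB. That matches the "about 1.5 GB reserved" figure quoted in the original question.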