Those are generated HTML files. As with all documentation, the source of
truth lies in `apache/kafka`. In this case you'd need to look at the Java
classes that define the configs, for example here
https://github.com/apache/kafka/blob/a39fcac95c82133ac6d9116216ae819d0bf9a6bd/storage/src/main/java/org/apache/kafka/server/log/remote/storage/RemoteLogManagerConfig.java#L148
and here
https://github.com/apache/kafka/blob/a39fcac95c82133ac6d9116216ae819d0bf9a6bd/clients/src/main/java/org/apache/kafka/common/config/TopicConfig.java#L92

Best,

‪On Thu, Feb 27, 2025 at 1:12 PM ‫אורי אהרוני‬‎ <282...@gmail.com> wrote:‬

> Well, I've seen that the configuration descriptions for topics are only in
> 'kafka-site', in the file 'topic_config.html', and not in the kafka
> repository.
> Does that mean I can open a PR only in 'kafka-site' with the change?
> Thanks
>
>
> On Thu, Feb 27, 2025 at 13:58, Josep Prat
> <josep.p...@aiven.io.invalid> wrote:
>
> > Hi there,
> > Documentation is in both repositories (https://github.com/apache/kafka-site
> > and https://github.com/apache/kafka). To submit a PR, you need to fork the
> > repo, make the changes and submit the PR. You can start by submitting a PR
> > changing the necessary files under
> > https://github.com/apache/kafka/tree/trunk/docs and then we can open a new
> > one for the `kafka-site` repo.
> > The Kafka repo is the source of truth, while the Kafka-site one is the
> > repository that serves the website and it's properly versioned in place.
> >
> > Best,
> >
> > ‪On Thu, Feb 27, 2025 at 12:53 PM ‫אורי אהרוני‬‎ <282...@gmail.com>
> > wrote:‬
> >
> > > I would like to propose a docs change for retention.bytes.
> > > I see it's in this repo: https://github.com/apache/kafka-site
> > > How can I get permission to open a PR or a new issue?
> > >
> > > On Tue, Feb 25, 2025 at 11:25, Brebner, Paul
> > > <paul.breb...@netapp.com.invalid> wrote:
> > >
> > > > Well spotted, I think. I was briefly puzzled by the time retention
> > > > behaviour, as segments seemed to live longer than advertised, until I
> > > > realised it was a minimum time: deletion is lazy and can occur at some
> > > > (distant?) time in the future (and is async, I think). This was
> > > > particularly noticeable for tiered storage (the only time I've really
> > > > understood how Kafka segments work and looked closely), Paul
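
A rough sketch of the time-based case, in case it helps. It assumes a
segment only becomes eligible for deletion once its newest record is older
than the retention time, and that segments are removed from the oldest end.
`Segment` and `expired_segments` are made-up names for illustration, not
Kafka APIs, and tiered storage is not modelled.

```python
from dataclasses import dataclass

HOUR_MS = 3_600_000


@dataclass
class Segment:
    first_ts_ms: int  # timestamp of the oldest record in the segment
    last_ts_ms: int   # timestamp of the newest record in the segment


def expired_segments(segments, now_ms, retention_ms):
    """Return the leading segments whose newest record exceeds retention_ms.

    Assumption: `segments` is ordered oldest first and deletion stops at the
    first segment that is still within the retention time.
    """
    expired = []
    for seg in segments:
        if now_ms - seg.last_ts_ms > retention_ms:
            expired.append(seg)
        else:
            break
    return expired


# Two segments, each covering six hours of records; retention is 24 hours.
segments = [Segment(0, 6 * HOUR_MS), Segment(6 * HOUR_MS, 12 * HOUR_MS)]
retention_ms = 24 * HOUR_MS

# At 29h the oldest record is 29h old, but the newest record in its segment
# is only 23h old, so nothing is eligible yet.
at_29h = expired_segments(segments, 29 * HOUR_MS, retention_ms)

# At 31h the newest record in the first segment is 25h old, so it can go.
at_31h = expired_segments(segments, 31 * HOUR_MS, retention_ms)

print(len(at_29h), len(at_31h))
```

Under these assumptions the oldest record in a segment can outlive the
retention time by up to the time span the segment covers, plus however long
it takes the background check to run.
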
> > > >
> > > > From: Matthias J. Sax <mj...@apache.org>
> > > > Date: Tuesday, 25 February 2025 at 1:16 pm
> > > > To: users@kafka.apache.org <users@kafka.apache.org>
> > > > Subject: Re: Documentation and meaning of configuration 'retention.bytes'
> > > >
> > > > I think you are right. Technically, it's a "minimum" not a "maximum".
> > > >
> > > > The cleanup happens asynchronously in a background thread on the broker.
> > > > Segments that go beyond the "retention.bytes" config can be removed.
> > > >
> > > > I think it's just a difference between "technically correct" (i.e.,
> > > > engineering / nerd language) and "regular English", i.e., how normal
> > > > people speak.
> > > >
> > > > In regular English one would say, "I limit the size to 1GB", even if 1GB
> > > > is not a strict limit (never larger than 1GB), but technically a lower
> > > > bound.
> > > >
> > > >
> > > > > I would appreciate if you could fix and clarify that in the documentation.
> > > >
> > > >
> > > > Feel free to open a PR for it :)
> > > >
> > > >
> > > >
> > > >
> > > > -Matthias
> > > >
> > > >
> > > > On 2/23/25 10:59 AM, אורי אהרוני wrote:
> > > > > Hi,
> > > > > I ran into a misunderstanding and would like you to explain it to me
> > > > > or, if possible, change the documentation.
> > > > >
> > > > > The Kafka docs describe the 'retention.bytes' configuration as:
> > > > > This configuration controls the maximum size a partition (which consists of
> > > > > log segments) can grow to before we will discard old log segments to free
> > > > > up space if we are using the "delete" retention policy
> > > > >
> > > > > Unfortunately I didn't fully understand the meaning of this field.
> > > > > I interpret it as: once a log segment reaches the 'retention.bytes' field,
> > > > > old segments will be deleted.
> > > > > But to my understanding that is not the case because, like
> > > > > retention.hours, I believe it is a guarantee of the (minimum) number of
> > > > > bytes that will be kept for a partition.
> > > > >
> > > > > I will give an example of the difference.
> > > > > An example from IBM:
> > > > > A topic with retention.bytes of 1 GB, and with a log segment size of 512 MB:
> > > > >
> > > > > With one partition, it would reserve about 1.5 GB of storage.
> > > > > In this case, the reserved size is significantly larger than the retention
> > > > > size.
> > > > >
> > > > > In this example, there's a guarantee that our topic size won't be LESS THAN
> > > > > 1 GB.
> > > > > But from the docs I expect that once the topic reaches 1GB (or a bit more),
> > > > > old segments will be deleted.
> > > > > In this example I would expect that when it reaches 1 GB, a segment will be
> > > > > automatically deleted and so the partition will be approximately 1 GB and
> > > > > not 1.5 GB as said.
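
The 1 GB / 512 MB example can be replayed with a small simulation. This is
only a sketch under assumptions: the oldest segment is dropped only if what
remains is still at least retention.bytes, the active segment is never
dropped, and the retention check runs after every write (on a real broker it
runs periodically, so the peak can be higher). All function names are
invented for illustration.

```python
MB = 1024 * 1024


def apply_size_retention(segment_sizes, retention_bytes):
    """Drop oldest segments while the remainder still holds >= retention_bytes.

    Assumption: the last entry is the active segment and is never dropped.
    """
    sizes = list(segment_sizes)
    excess = sum(sizes) - retention_bytes
    while len(sizes) > 1 and excess - sizes[0] >= 0:
        excess -= sizes.pop(0)
    return sizes


def simulate(retention_bytes, segment_bytes, write_bytes, writes):
    """Append fixed-size writes and track the partition size over time.

    Returns (peak size before cleanup, smallest size right after a deletion).
    """
    sizes = [0]  # oldest first; the last entry is the active segment
    peak = 0
    lowest_after_delete = None
    for _ in range(writes):
        # Roll to a new segment when the write would not fit in the active one.
        if sizes[-1] + write_bytes > segment_bytes:
            sizes.append(0)
        sizes[-1] += write_bytes
        peak = max(peak, sum(sizes))

        kept = apply_size_retention(sizes, retention_bytes)
        if len(kept) < len(sizes):
            total = sum(kept)
            if lowest_after_delete is None or total < lowest_after_delete:
                lowest_after_delete = total
        sizes = kept
    return peak, lowest_after_delete


peak, lowest = simulate(
    retention_bytes=1024 * MB,
    segment_bytes=512 * MB,
    write_bytes=64 * MB,
    writes=100,
)
print(peak // MB, lowest // MB)
```

With these numbers the partition peaks at 1536 MB (about 1.5 GB) and drops
back to 1024 MB after each deletion, which is consistent with
retention.bytes acting as a lower bound on what is kept.
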
> > > > >
> > > > > My question is whether I understood the definition of the field correctly.
> > > > > If not, I would be happy if you could explain what I missed.
> > > > > If I'm correct that the definition is not well explained, I would
> > > > > appreciate if you could fix and clarify that in the documentation.
> > > > > Thanks,
> > > > > Ori.
> > > > >
> > > >
> > >
> > >
> > > --
> > > *Ori Aharoni*
> > >
> >
> >
> > --
> > [image: Aiven] <https://www.aiven.io>
> >
> > *Josep Prat*
> > Engineering Director, Streaming Services, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io <https://www.aiven.io>   |   <https://www.facebook.com/aivencloud>
> >   <https://www.linkedin.com/company/aiven/>   <https://twitter.com/aiven_io>
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > Anna Richardson, Kenneth Chen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
>
>
> --
> *Ori Aharoni*
>


-- 
*Josep Prat*
Engineering Director, Streaming Services, *Aiven*
josep.p...@aiven.io   |   +491715557497
