Hi Rahul,

If I understand your question correctly, you are interested only in the
latest value for each key and don't want to keep any older values once the
value for a given key has been updated.

If you want all segments to be included in compaction, check what value you
have set for this property: log.cleaner.min.compaction.lag.ms.
If you don't set it (the default is 0), all messages/segments are eligible
for compaction except the active segment. This will still not remove older
messages in the active segment/"hot table".
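
To make the point above concrete, here is a rough Python sketch. It is only a
toy model of my understanding, not Kafka's actual cleaner code; the names
Record and compact are made up for illustration. It keeps the latest record
per key across the closed segments and leaves the active segment untouched,
so duplicates in the active segment survive.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape for this sketch: (offset, key, value)
Record = Tuple[int, str, str]


def compact(closed_segments: List[List[Record]],
            active_segment: List[Record]) -> List[Record]:
    """Toy model of log compaction (not Kafka's implementation).

    Keeps only the latest record per key among the closed segments.
    The active segment is never rewritten, so it is appended as-is.
    """
    latest: Dict[str, Record] = {}
    for segment in closed_segments:
        for record in segment:
            # Records arrive in offset order, so later ones overwrite earlier ones.
            latest[record[1]] = record
    survivors = sorted(latest.values(), key=lambda r: r[0])
    return survivors + list(active_segment)


closed = [[(0, "k1", "a"), (1, "k2", "b")], [(2, "k1", "c")]]
active = [(3, "k1", "d"), (4, "k1", "e")]
result = compact(closed, active)
```

In this sketch, offset 0 (the old value of k1) is dropped, while both k1
records in the active segment remain until that segment rolls.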

There is one property you can try: set min.compaction.lag.ms (the
topic-level counterpart of the broker-level
log.cleaner.min.compaction.lag.ms) to a very low value, so that messages
become eligible for compaction as soon as their segment is no longer the
active (head, uncompacted) segment.
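
As a rough illustration of how I understand the lag setting, here is a small
Python sketch. It is a simplification and not the broker's real logic; the
function segment_is_cleanable and its parameters are hypothetical names. A
segment is treated as eligible only if it is not the active one and its
newest message is at least min.compaction.lag.ms old.

```python
def segment_is_cleanable(largest_timestamp_ms: int,
                         now_ms: int,
                         min_compaction_lag_ms: int,
                         is_active: bool) -> bool:
    """Simplified eligibility check (an assumption-based sketch).

    The active segment is never eligible. Any other segment is eligible
    once its newest message is at least `min_compaction_lag_ms` old.
    """
    if is_active:
        return False
    return largest_timestamp_ms <= now_ms - min_compaction_lag_ms


# With a lag of 0, a closed segment is eligible right away.
closed_no_lag = segment_is_cleanable(1000, 1500, 0, is_active=False)
# With a lag of 1000 ms, the same segment has to wait.
closed_with_lag = segment_is_cleanable(1000, 1500, 1000, is_active=False)
# The active segment is skipped regardless of the lag.
active_no_lag = segment_is_cleanable(1000, 1500, 0, is_active=True)
```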


Note: I still haven't answered your question of "why this behavior?". I have
only shared my understanding of how to keep just the updated values.

Thank you,
Naresh




On Sat, Jan 20, 2018 at 7:47 AM Rahul Bhattacharjee <rahul.rec....@gmail.com>
wrote:

> Ok, so there is no attempt made at de-duplication while the row is still
> hot in the memtable. Why is this the behaviour?
> For compacted topics we are only interested in the last update for any key.
>
>
> thanks,
> Rahul
>
> On Fri, Jan 19, 2018 at 3:18 PM, Matthias J. Sax <matth...@confluent.io>
> wrote:
>
> > Yes and no.
> >
> > There is a background compaction thread that runs periodically (you can
> > configure the scheduling for this thread). Thus, compaction happens
> > async.
> >
> > It's correct that the current head segment is not considered for
> > compaction. There is also no de-duplication on write; messages are
> > just appended.
> >
> > You can also configure the segment size and roll behavior if you need
> > more "aggressive" compaction.
> >
> >
> > -Matthias
> >
> > On 1/19/18 1:21 PM, Matt Farmer wrote:
> > > Yeah, and I thought I answered your question? I think the compaction
> > > happens when new segments are created.
> > >
> > > Sorry if I’m still misunderstanding.
> > >
> > >> On Jan 19, 2018, at 3:55 PM, Rahul Bhattacharjee <
> > >> rahul.rec....@gmail.com> wrote:
> > >>
> > >> Thanks Matt for the response. I was asking about the log compaction
> > >> <https://kafka.apache.org/documentation/#compaction> of kafka topics.
> > >>
> > >> On Fri, Jan 19, 2018 at 12:36 PM, Matt Farmer <m...@frmr.me> wrote:
> > >>
> > >>> Someone will need to correct me if I’m wrong, but my understanding is
> > >>> that a topic log on disk is divided into segments. Compaction will
> > >>> occur when a segment “rolls off” - so when a new active segment is
> > >>> created and the previous segment becomes inactive.
> > >>>
> > >>> Segments can be bounded by size and time in topic and broker
> > >>> configuration to get the effect that you want.
> > >>>
> > >>>> On Jan 19, 2018, at 2:10 PM, Rahul Bhattacharjee <
> > >>>> rahul.rec....@gmail.com> wrote:
> > >>>>
> > >>>> Let's say we have a compacted topic (log.cleanup.policy=compact)
> > >>>> where a lot of updates happen for a relatively small set of keys.
> > >>>> My question is: when does the compaction happen?
> > >>>>
> > >>>> In the memtable, when a new update comes for a key that already
> > >>>> exists in the memtable, the value is simply replaced,
> > >>>> or,
> > >>>> all the updates are associated with an offset, later the memtable is
> > >>>> spilled to disk, and the deletion happens during the compaction phase.
> > >>>>
> > >>>> thanks,
> > >>>> Rahul
> > >>>
> > >>>
> > >
> >
> >
>
