Thank you Naresh. It answers my question.

On Sat, Jan 20, 2018 at 8:15 AM, naresh Goud <nareshgoud.du...@gmail.com> wrote:
> Hi Rahul,
>
> If I understand your question correctly, you are interested only in the
> latest values for keys and don't want to keep any older values once the
> value for a given key has been updated.
>
> If you want all segments to be included in compaction, check what value
> this property is set to: log.cleaner.min.compaction.lag.ms.
> If you don't set this value, all messages/segments are eligible for
> compaction except the active segment. This will still not delete messages
> in the active segment/hot table.
>
> There is one property you can try: set min.compaction.lag.ms to a very low
> value so that messages become eligible for compaction immediately, i.e. they
> move from the head log/segment (uncompacted) to a compaction-eligible
> segment.
>
> Note: I still didn't answer your question of "why this behavior?". I just
> shared my understanding of how to maintain updated values.
>
> Thank you,
> Naresh
>
> On Sat, Jan 20, 2018 at 7:47 AM Rahul Bhattacharjee <rahul.rec....@gmail.com> wrote:
>
> > OK, so no attempt is made at de-duplication while the row is still
> > hot in the memtable. Why this behaviour?
> > For compacted topics we are only interested in the last update for any key.
> >
> > thanks,
> > Rahul
> >
> > On Fri, Jan 19, 2018 at 3:18 PM, Matthias J. Sax <matth...@confluent.io> wrote:
> >
> > > Yes and no.
> > >
> > > There is a background compaction thread that runs periodically (you can
> > > configure the scheduling for this thread). Thus, compaction happens
> > > async.
> > >
> > > It's correct that the current head segment is not considered for
> > > compaction. There is also no de-duplication on write; messages are
> > > just appended.
> > >
> > > You can also configure the segment size and roll behavior if you need
> > > more "aggressive" compaction.
> > >
> > >
> > > -Matthias
> > >
> > > On 1/19/18 1:21 PM, Matt Farmer wrote:
> > > > Yeah, and I thought I answered your question? I think the compaction
> > > > happens when new segments are created.
> > > >
> > > > Sorry if I’m still misunderstanding.
> > > >
> > > >> On Jan 19, 2018, at 3:55 PM, Rahul Bhattacharjee <rahul.rec....@gmail.com> wrote:
> > > >>
> > > >> Thanks Matt for the response. I was asking about the log compaction
> > > >> <https://kafka.apache.org/documentation/#compaction> of Kafka topics.
> > > >>
> > > >> On Fri, Jan 19, 2018 at 12:36 PM, Matt Farmer <m...@frmr.me> wrote:
> > > >>
> > > >>> Someone will need to correct me if I’m wrong, but my understanding is
> > > >>> that a topic log on disk is divided into segments. Compaction will
> > > >>> occur when a segment “rolls off”, that is, when a new active segment
> > > >>> is created and the previous segment becomes inactive.
> > > >>>
> > > >>> Segments can be bounded by size and time in topic and broker
> > > >>> configuration to get the effect that you want.
> > > >>>
> > > >>>> On Jan 19, 2018, at 2:10 PM, Rahul Bhattacharjee <rahul.rec....@gmail.com> wrote:
> > > >>>>
> > > >>>> Let's say we have a compacted topic (log.cleanup.policy=compact)
> > > >>>> where a lot of updates happen for a relatively small set of keys.
> > > >>>> My question is: when does the compaction happen?
> > > >>>>
> > > >>>> In the memtable, when a new update comes in for a key that already
> > > >>>> exists in the memtable, the value is simply replaced,
> > > >>>> or,
> > > >>>> all the updates are associated with an offset, later the memtable is
> > > >>>> spilled to disk, and the deletion happens during the compaction phase.
> > > >>>>
> > > >>>> thanks,
> > > >>>> Rahul
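
For anyone reading this thread later, the behaviour described above (append-only writes, a head segment that is never compacted, and a background pass over rolled segments) can be illustrated with a toy model. This is only a sketch and not Kafka's implementation: the class `ToyCompactedLog`, its methods, and the record-count `segment_size` are made up for illustration, with `segment_size` standing in for the segment size and roll settings Matthias mentions.

```python
class ToyCompactedLog:
    """Toy model of one compacted partition (hypothetical, not Kafka code)."""

    def __init__(self, segment_size):
        # Max records per segment; a stand-in for the real size/time roll settings.
        self.segment_size = segment_size
        self.closed = []      # rolled (inactive) segments, oldest first
        self.active = []      # head segment, never touched by compaction
        self.next_offset = 0

    def append(self, key, value):
        # No de-duplication on write: every update is appended with a new offset.
        if len(self.active) >= self.segment_size:
            self.closed.append(self.active)   # roll: head becomes inactive
            self.active = []
        self.active.append((self.next_offset, key, value))
        self.next_offset += 1

    def compact(self):
        # One "background cleaner" pass over closed segments only:
        # keep the record with the highest offset for each key.
        latest = {}
        for segment in self.closed:
            for offset, key, _ in segment:
                latest[key] = offset
        kept = [
            [rec for rec in segment if latest[rec[1]] == rec[0]]
            for segment in self.closed
        ]
        self.closed = [segment for segment in kept if segment]

    def records(self):
        # All records currently in the log, in offset order.
        return [rec for seg in self.closed for rec in seg] + list(self.active)


log = ToyCompactedLog(segment_size=2)
for key, value in [("k1", "v1"), ("k1", "v2"), ("k1", "v3"), ("k2", "a"), ("k1", "v4")]:
    log.append(key, value)
log.compact()
print(log.records())
```

In this model, key `k1` still shows up twice after the compaction pass: once in a closed segment and once in the head segment. That matches the point made in the thread that older values for a key can stay visible until the head segment rolls and a later pass picks it up, so smaller segments or more frequent rolls give more "aggressive" compaction.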