Just a clarification based on Gwen's reply: *log.segment.bytes* is set to 1 GB by default. If we haven't set any value for *log.roll.ms*, it likewise falls back to its default of 168 hours. In that case, will a new log segment file be rolled after every 1 GB?
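To make the question concrete, here is how I currently read the two settings: as independent triggers, where whichever limit is reached first causes a roll. This is only a simplified sketch of that reading, not Kafka's actual code; `should_roll` and its parameters are hypothetical names I made up, and it ignores other roll triggers such as a full index.

```python
# Simplified model of the segment roll decision (an assumption-laden sketch,
# not the broker's real implementation).
GB = 1024 ** 3
HOUR_MS = 60 * 60 * 1000

SEGMENT_BYTES = 1 * GB      # log.segment.bytes default
ROLL_MS = 168 * HOUR_MS     # effective roll time when log.roll.ms is unset


def should_roll(segment_size_bytes, segment_age_ms, next_message_bytes):
    """Return True if the next append should go to a new segment.

    segment_age_ms is the time since the active segment received its
    first message (hypothetical parameter for this sketch).
    """
    size_limit_hit = segment_size_bytes + next_message_bytes > SEGMENT_BYTES
    time_limit_hit = segment_age_ms > ROLL_MS
    return size_limit_hit or time_limit_hit
```

Under this reading, a busy partition would roll roughly every 1 GB well before the 168 hours elapse, while a quiet partition would roll on time instead.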
On Fri, Apr 8, 2016 at 11:32 AM Heath Ivie <hi...@autoanything.com> wrote:

> Gwen,
>
> Thanks for the detailed reply.
>
> That makes it more clear for me.
>
> Heath
>
> -----Original Message-----
> From: Gwen Shapira [mailto:g...@confluent.io]
> Sent: Tuesday, April 05, 2016 6:13 PM
> To: users@kafka.apache.org
> Subject: Re: Log Retention: What gets deleted
>
> I think you got it almost right. The missing part is that we only delete
> whole partition segments, not individual messages.
>
> As you are writing messages, every X bytes or Y milliseconds, a new file
> gets created for the partition to store new messages in. Those files are
> called segments.
> The segment you are currently writing to is an active segment.
>
> We will never delete an active segment, so in order to delete old messages
> we will look for an inactive segment where the newest message is older than
> our retention and delete the entire segment.
>
> So there are several parameters controlling when will data get deleted
> (I'm looking at just the time based, not the size-based):
> 1. log.retention.ms - how old messages should be before we consider them
>    for deletion
> 2. log.roll.ms - how frequently we roll new segments.
>    Messages will not get deleted before a new segment is rolled
> 3. log.retention.check.interval.ms - how frequently we check for segments
>    that we can delete.
>
> A message will be deleted if all 3 are true:
> 1. It is older than log.retention.ms
> 2. It is in an inactive segment, meaning enough time passed since the
>    message was written to roll a new segment
> 3. Kafka checked for segments that can be deleted, meaning that more than
>    check.interval.ms time passed since the segment was rolled.
>
> Hope this helps,
>
> Gwen
>
> On Fri, Apr 1, 2016 at 12:21 PM, Heath Ivie <hi...@autoanything.com>
> wrote:
>
> > Hi,
> >
> > I have some questions about the log retention and specifically what
> > gets deleted.
> >
> > I have a test app where I am writing 10 logs to the topic every second.
> >
> > What I would expect is a lag in a group would be somewhere around 10
> > if I have retention.ms at 1000.
> >
> > What I am seeing is that the lag continues to grow, but then at some
> > point all messages are gone and the lag is at 0.
> >
> > I thought that the messages that are old would be deleted first.
> >
> > Am I misinterpreting how the log retention works?
> >
> > Heath Ivie
> > Solutions Architect
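
For reference, the time-based conditions in Gwen's reply can be sketched roughly as below. It is a toy model only: `Segment` and `deletable_segments` are hypothetical names, timestamps are plain integers in milliseconds, and the function stands for a single retention check (the part that runs every log.retention.check.interval.ms).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    newest_message_ts_ms: int  # timestamp of the newest message in the segment
    active: bool               # True for the segment currently being written to


def deletable_segments(segments, now_ms, retention_ms):
    """Return the segments one retention check would delete.

    A segment qualifies only if it is inactive and even its newest
    message is older than the retention period.
    """
    return [
        s for s in segments
        if not s.active and now_ms - s.newest_message_ts_ms > retention_ms
    ]


# retention.ms = 1000, check happening at t = 10_500 ms
segments = [
    Segment(newest_message_ts_ms=1_000, active=False),  # old, inactive
    Segment(newest_message_ts_ms=9_500, active=False),  # exactly at the limit
    Segment(newest_message_ts_ms=2_000, active=True),   # old but active
]
expired = deletable_segments(segments, now_ms=10_500, retention_ms=1_000)
```

In this model only the first segment is removed. The active segment survives however old its messages are, which fits the behaviour described in the original question: the lag keeps growing until a segment is finally rolled, and then a whole segment of messages disappears at once.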