Have you configured log.retention.bytes?

Thanks,
Jun

On Thu, Jul 24, 2014 at 10:04 AM, Kashyap Paidimarri <kashy...@gmail.com> wrote:

> We just noticed that one of our topics has been horribly misbehaving.
>
> retention.ms for the topic is set to 1209600000 ms.
>
> However, segments are getting scheduled for deletion as soon as a new one
> is rolled over. And naturally consumers are running into a
> kafka.common.OffsetOutOfRangeException whenever this happens.
>
> Is this a known bug? It is incredibly serious. We seem to have lost about
> 40 million messages on a single topic and are yet to figure out which other
> topics are affected.
>
> I thought of restarting Kafka but figured I'd leave it untouched while I
> figure out what I can capture for finding the root cause.
>
> Meanwhile, in order to keep from losing any more data, I have a periodic job
> that is doing a 'cp -al' of the partitions into a separate folder. That
> way Kafka goes ahead and deletes the segment but the data is not lost from
> the filesystem.
>
> If this is an unseen bug, what should I save from the running instance?
>
> By the way, this has affected all partitions and replicas of the topic,
> not just a specific host.
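
For context on the log.retention.bytes question: Kafka applies time-based and size-based retention independently, so a small size limit can make segments eligible for deletion right after a roll even when retention.ms is two weeks. The sketch below is a simplified model of that interaction, not Kafka's actual code. The `Segment` class, the `segments_to_delete` function, and the example sizes are hypothetical, and whether a size limit is the cause in this particular case is not established in the thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment of a partition."""
    base_offset: int
    size_bytes: int
    last_modified_ms: int


def segments_to_delete(segments: List[Segment], now_ms: int,
                       retention_ms: int, retention_bytes: int) -> List[Segment]:
    """Return the segments a retention pass would remove (simplified model).

    `segments` is ordered oldest first; the last one is treated as the
    active segment and is never deleted. A negative limit disables that check.
    """
    doomed: List[Segment] = []

    # Time-based check: drop closed segments older than retention_ms.
    for seg in segments[:-1]:
        if retention_ms >= 0 and now_ms - seg.last_modified_ms > retention_ms:
            doomed.append(seg)

    remaining = [s for s in segments if s not in doomed]

    # Size-based check: drop the oldest closed segments while the log
    # would still be at least retention_bytes after removing them.
    if retention_bytes >= 0:
        excess = sum(s.size_bytes for s in remaining) - retention_bytes
        for seg in remaining[:-1]:
            if excess - seg.size_bytes >= 0:
                excess -= seg.size_bytes
                doomed.append(seg)
            else:
                break

    return doomed


GIB = 1024 ** 3
TWO_WEEKS_MS = 1209600000  # the retention.ms value from the original message

# Three 1 GiB segments, all written within the last few minutes.
now = 1_406_000_000_000
log = [
    Segment(base_offset=0, size_bytes=GIB, last_modified_ms=now - 120_000),
    Segment(base_offset=1000, size_bytes=GIB, last_modified_ms=now - 60_000),
    Segment(base_offset=2000, size_bytes=GIB, last_modified_ms=now),
]

# With a 1 GiB size limit, both closed segments are removed despite being new.
with_size_limit = segments_to_delete(log, now, TWO_WEEKS_MS, retention_bytes=GIB)

# With the size limit disabled, nothing is removed until two weeks pass.
without_size_limit = segments_to_delete(log, now, TWO_WEEKS_MS, retention_bytes=-1)
```

In this model, the only way to see deletion on every roll with a two-week retention.ms is for the size limit to be at or below roughly one segment's worth of data, which is why checking the configured retention bytes is a reasonable first step.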