Hi Niklas,

The default value of segment.ms is set to 10 min as part of this project
(introduced in Kafka 1.1.0):
https://jira.apache.org/jira/browse/KAFKA-6150
https://cwiki.apache.org/confluence/display/KAFKA/KIP-204+%3A+Adding+records+deletion+operation+to+the+new+Admin+Client+API

In KIP-204 (KAFKA-6150), we added an admin request to periodically delete
records immediately upon committing offsets, to make repartition topics
really "transient", and along with it we set the default segment.ms to 10
min. The rationale is that to make record purging effective, we need a
smaller segment size, so that we can delete segment files promptly once the
purged offset is larger than the segment's last offset.

Which Kafka version are you using currently? Did you observe that data
purging did not happen (otherwise segment files should be garbage-collected
quickly)? Or is your traffic very small, or do you commit infrequently,
which resulted in ineffective purging?

Guozhang

On Tue, Oct 9, 2018 at 4:07 AM, Niklas Lönn <niklas.l...@gmail.com> wrote:
> Hi,
>
> Recently we experienced a problem when resetting a streams application
> doing quite a lot of operations based on 2 compacted source topics, with
> 20 partitions.
>
> We crashed the entire broker cluster with a TooManyOpenFiles exception
> (we already have a multi-million limit).
>
> When inspecting the internal topics' configuration I noticed that the
> repartition topics have a default config of:
>
> Configs: segment.bytes=52428800, segment.index.bytes=52428800,
> cleanup.policy=delete, segment.ms=600000
>
> My source topic is a compacted topic used as a KTable, and if we assume I
> have data for every 10-minute segment, I would quickly get 144 segments
> per partition per day.
>
> Since this repartition topic is not even compacted, I can't understand
> the reasoning behind having a default of 10 min segment.ms and 50 MB
> segment.bytes.
>
> Is there any best practice regarding this? Potentially we could crash the
> cluster every time we need to reset an application.
>
> And does it make sense that it would keep so many open files at the same
> time in the first place? Could it be a bug in the file management of the
> Kafka broker?
>
> Kind regards
> Niklas

--
-- Guozhang
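
[Editorial note: a minimal back-of-the-envelope sketch of the open-file
arithmetic discussed in this thread. It assumes each rolled segment keeps at
least three on-disk files (.log, .index, .timeindex); the broker may hold
additional files per segment, so treat this as a lower bound.]

```python
# Estimate how many segment files one repartition topic can accumulate
# with the default segment.ms of 10 minutes, if records are never purged.
# Assumption: 3 files per segment (.log, .index, .timeindex) as a lower bound.

MINUTES_PER_DAY = 24 * 60


def segment_files(partitions, segment_minutes=10, days=1, files_per_segment=3):
    """Lower-bound count of on-disk segment files if nothing is purged."""
    segments_per_partition = (MINUTES_PER_DAY // segment_minutes) * days
    return partitions * segments_per_partition * files_per_segment


# 20 partitions, one day of un-purged data: 20 * 144 * 3 = 8640 files.
print(segment_files(20))
# The same topic left un-purged for a week: 60480 files.
print(segment_files(20, days=7))
```

With several such internal topics, a week of ineffective purging puts the
broker well into the hundreds of thousands of open handles, which is why
the deleteRecords purging from KIP-204 matters at this segment size.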