Hi,

Thanks for the answer. Looking at the high water mark, the logic would be
to flag the partitions that have

high_watermark == log_start_offset

In addition, I'm thinking that having the leader fulfill that criterion is
enough to flag a partition, and to check the replicas only if requested by
the user.

On Fri, 21 Jun 2019 at 23:35, Colin McCabe <cmcc...@apache.org> wrote:

> I don't think this requires a change in the protocol. It seems like you
> should be able to use the high water mark to figure something out here?
>
> best,
> Colin
>
>
> On Fri, Jun 21, 2019, at 04:56, Carlos Manuel Duclos-Vergara wrote:
> > Hi,
> >
> > This is an ancient task, but I feel it is still current today
> > (especially since, as somebody who deals with a Kafka cluster, I know
> > that this happens more often than not).
> >
> > The task is about garbage collection of topics in a sort of automated
> > way. After some consideration I started a prototype implementation
> > based on a manual process:
> >
> > 1. Using the cli, I can use the --describe-topic to get a list of topics
> > that have size 0
> > 2. Massage that list into something that can then be fed into the cli
> > and remove the topics that have size 0.
> >
> > The guiding principle here is the assumption that abandoned topics will
> > eventually have size 0, because all records will expire. This is not true
> > for all topics, but it covers a large portion of them, and having
> > something like this would help admins to find "suspicious" topics at
> > least.
> >
> > I started implementing this change and I realized that it would require a
> > change in the protocol, because the sizes are never sent over the wire.
> > Funnily enough, we collect the sizes of the log files, but we do not
> > send them.
> >
> > I think this kind of change will require a KIP, but I wanted to ask what
> > others think about this.
> >
> > The in-progress implementation of this can be found here:
> >
> > https://github.com/carlosduclos/kafka/commit/0dffe5e131c3bd32b77f56b9be8eded89a96df54
> >
> > Comments?
> >
> > --
> > Carlos Manuel Duclos Vergara
> > Backend Software Developer
> >
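
For reference, the check proposed at the top of this message (flag a
partition when high_watermark == log_start_offset on the leader, and look
at the other replicas only on request) could be sketched roughly as below.
This is only an illustration of the logic, not Kafka code: ReplicaOffsets,
partition_is_empty, flag_empty_partitions and the check_replicas switch are
hypothetical names, and the sketch assumes the two offsets have already been
fetched for each replica by some other means.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

TopicPartition = Tuple[str, int]  # (topic name, partition number)


@dataclass(frozen=True)
class ReplicaOffsets:
    """Offsets reported by one replica (hypothetical structure)."""
    log_start_offset: int
    high_watermark: int
    is_leader: bool

    def is_empty(self) -> bool:
        # No readable records when the two offsets coincide.
        return self.high_watermark == self.log_start_offset


def partition_is_empty(replicas: List[ReplicaOffsets],
                       check_replicas: bool = False) -> bool:
    """Return True if the partition should be flagged as empty.

    By default only the leader is consulted; with check_replicas=True
    every replica must be empty as well.
    """
    leaders = [r for r in replicas if r.is_leader]
    if not leaders:
        # Assumption: without a known leader we do not flag anything.
        return False
    if not all(r.is_empty() for r in leaders):
        return False
    if check_replicas:
        return all(r.is_empty() for r in replicas)
    return True


def flag_empty_partitions(offsets: Dict[TopicPartition, List[ReplicaOffsets]],
                          check_replicas: bool = False) -> List[TopicPartition]:
    """Return the sorted list of partitions that satisfy the criterion."""
    return sorted(tp for tp, replicas in offsets.items()
                  if partition_is_empty(replicas, check_replicas))


if __name__ == "__main__":
    # Made-up offsets, purely for illustration.
    sample = {
        ("orders", 0): [ReplicaOffsets(120, 120, True),
                        ReplicaOffsets(118, 120, False)],
        ("orders", 1): [ReplicaOffsets(0, 57, True)],
    }
    print(flag_empty_partitions(sample))
    print(flag_empty_partitions(sample, check_replicas=True))
```

Keeping the replica check behind a flag follows the idea above that the
leader alone is enough in the common case. A topic would presumably be
reported as a deletion candidate only when all of its partitions are flagged.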