Hi Karan,

I think what you are seeing with `--time -1` and `--time -2` confirms that the messages have been deleted from the log. The offset returned is the same in both cases, which means the start offset and the end offset of each partition are equal (i.e. the log is empty). When messages are removed from the log, the offsets do not reset to 0. The end offset only ever increases; what changes when log retention runs is the start offset, which moves forward.
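To make this concrete, here is a minimal Python sketch that treats the `--time -2` value as the start offset and the `--time -1` value as the end offset, and subtracts them per partition. The helper names (`parse_offsets`, `message_counts`) are hypothetical, and the sketch assumes the `topic:partition:offset` output format shown in this thread and a topic that is not log-compacted (compaction can leave gaps between offsets).

```python
def parse_offsets(output):
    """Parse GetOffsetShell-style lines of the form topic:partition:offset.

    Assumes one offset per line, as in the output pasted in this thread.
    """
    offsets = {}
    for line in output.strip().splitlines():
        # Split from the right so a topic name containing ':' is still handled.
        topic, partition, offset = line.strip().rsplit(":", 2)
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def message_counts(earliest_output, latest_output):
    """Return end offset minus start offset for every partition."""
    earliest = parse_offsets(earliest_output)
    latest = parse_offsets(latest_output)
    return {tp: latest[tp] - earliest[tp] for tp in latest}


# Output of `--time -2` (earliest) and `--time -1` (latest) from this thread.
earliest_output = """topicPurge:0:67
topicPurge:1:67
topicPurge:2:66"""
latest_output = """topicPurge:0:67
topicPurge:1:67
topicPurge:2:66"""

counts = message_counts(earliest_output, latest_output)
for (topic, partition), count in sorted(counts.items()):
    print(f"{topic}:{partition} holds {count} message(s)")
```

With the values from this thread every partition comes out as 0, which matches the consumer not being able to read any data.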
So, in order to find the number of messages in a partition, you can just get the difference of the offsets returned from `--time -1` and `--time -2`.

I hope this answers your question.

Thanks.
--Vahid



From: karan alang <karan.al...@gmail.com>
To: users@kafka.apache.org
Date: 06/22/2017 11:14 PM
Subject: Re: Deleting/Purging data from Kafka topics (Kafka 0.10)



Hi Vahid,

here is the output of the GetOffsetShell commands (with --time -1 & -2)

$KAFKA10_HOME/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:6092,localhost:6093,localhost:6094,localhost:6095 --topic topicPurge --time -2 --partitions 0,1,2

topicPurge:0:67
topicPurge:1:67
topicPurge:2:66

Karans-MacBook-Pro-3:config karanalang$ $KAFKA10_HOME/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:6092,localhost:6093,localhost:6094,localhost:6095 --topic topicPurge --time -1 --partitions 0,1,2

topicPurge:0:67
topicPurge:1:67
topicPurge:2:66

So, how do i interpret the above ?
I was expecting the zookeeper to be purged too .. & the offsets shown as 0, however that is not the case.
(the observation seem to tally with what you put in your email, i think)

Also, the consumer is not able to read any data.. so i guess the data is actually purged ?
However, that also brings up additional questions ..

I was using the GetOffsetShell command to get the count, but seems that is not necessarily the right way ..
What command should be used to get the count ?

On Thu, Jun 22, 2017 at 8:34 PM, Vahid S Hashemian <vahidhashem...@us.ibm.com> wrote:

> Hi Karan,
>
> Just to clarify, with `--time -1` you are getting back the latest offset
> of the partition.
> If you do `--time -2` you'll get the earliest valid offset.
>
> So, let's say the latest offset of partition 0 of topic 'test' is 100.
> When you publish 5 messages to the partition, and before retention policy
> kicks in,
> - with `--time -1` you should get test:0:105
> - with `--time -2` you should get test:0:100
>
> But after retention policy kicks in and old messages are removed,
> - with `--time -1` you should get test:0:105
> - with `--time -2` you should get test:0:105
>
> Could you please advise whether you're seeing a different behavior?
>
> Thanks.
> --Vahid
>
>
>
> From: "Vahid S Hashemian" <vahidhashem...@us.ibm.com>
> To: users@kafka.apache.org
> Date: 06/22/2017 06:43 PM
> Subject: Re: Deleting/Purging data from Kafka topics (Kafka 0.10)
>
>
>
> Hi Karan,
>
> I think the issue is in the verification step, because the start and end
> offsets are not going to be reset when messages are deleted.
> Have you checked whether a consumer would see the messages that are
> supposed to be deleted? Thanks.
>
> --Vahid
>
>
>
> From: karan alang <karan.al...@gmail.com>
> To: users@kafka.apache.org
> Date: 06/22/2017 06:09 PM
> Subject: Re: Deleting/Purging data from Kafka topics (Kafka 0.10)
>
>
>
> Hi Vahid,
>
> somehow, the changes suggested don't seem to be taking effect, and i dont
> see the data being purged from the topic.
>
> Here are the steps i followed -
>
> 1) topic is set with param -- retention.ms=1000
>
> $KAFKA10_HOME/bin/kafka-topics.sh --describe --topic topicPurge --zookeeper localhost:2161
>
> Topic:topicPurge PartitionCount:3 ReplicationFactor:3 Configs:retention.ms=1000
> Topic: topicPurge Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
> Topic: topicPurge Partition: 1 Leader: 0 Replicas: 0,2,3 Isr: 0,2,3
> Topic: topicPurge Partition: 2 Leader: 1 Replicas: 1,3,0 Isr: 1,3,0
>
> 2) There are 4 brokers, and in the server.properties (for each of the
> brokers), i've modified the following property
>
> log.retention.check.interval.ms=30000
>
> I am expecting the data to be purged every 30 secs based on property -
> log.retention.check.interval.ms, however, that does not seem to be
> happening.
>
> 3) Here is the command to check the offsets
>
> $KAFKA10_HOME/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:6092,localhost:6093,localhost:6094,localhost:6095 --topic topicPurge --time -1 --partitions 0,1,2
>
> topicPurge:0:67
> topicPurge:1:67
> topicPurge:2:66
>
> Any ideas on what the issue might be ?
>
>
>
> On Thu, Jun 22, 2017 at 1:31 PM, Vahid S Hashemian <vahidhashem...@us.ibm.com> wrote:
>
> > Hi Karan,
> >
> > The other broker config that plays a role here is
> > "log.retention.check.interval.ms".
> > For a low log retention time like in your example, if this broker config
> > value is much higher, then the broker doesn't delete old logs regularly
> > enough.
> >
> > --Vahid
> >
> >
> >
> > From: karan alang <karan.al...@gmail.com>
> > To: users@kafka.apache.org
> > Date: 06/22/2017 12:27 PM
> > Subject: Deleting/Purging data from Kafka topics (Kafka 0.10)
> >
> >
> >
> > Hi All -
> > How do i go about deleting data from Kafka Topics ? I've Kafka 0.10
> > installed.
> >
> > I tried setting the parameter of the topic as shown below ->
> >
> > $KAFKA10_HOME/bin/kafka-topics.sh --zookeeper localhost:2161 --alter --topic mmtopic6 --config retention.ms=1000
> >
> > I was expecting to have the data purged in about a min or so .. however, i
> > dont see that happening ..
> > any ideas on what needs to be done ?