[ 
https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703007#comment-13703007
 ] 

Joel Koshy commented on KAFKA-559:
----------------------------------

Thanks for the patch. Overall, looks good.  Couple of comments, mostly minor
in no particular order:

* I think dry-run does not need any further qualifier such as withOptionalArg, 
describedAs, ofType - it's just a flag.
* For --since, I prefer the seconds since epoch over some fixed input format 
which then brings in ambiguity such as timezone 24h vs 12h, etc. A better 
alternative would be to accept date strings and use  the DateFormat class with 
lenient parsing turned on or something like that. --before may be more 
intuitive than --since.
* Can use CommandLineUtils.checkRequiredArgs
* deleteBy matching - prefer to use case match and thereby avoid the explicit 
check on valid values. Also the message on invalid value of deleteBy should 
inform what the valid values are.
* Right now you support the following modes: delete stale topics across all 
groups, delete stale topics in a specific group. I think it would be useful to 
make deleteBy optional - if unspecified, it scans all groups and gets rid of 
stale groups.
* line 75: warn ("msg", e)
* line 101: should provide a reason for aborting
* line 110: doesn't gropudirs have an offset path? if not maybe we should add it
* Logging should include last mtime as that may be useful information reported 
by the dry-run
* No need to add a wrapper shell script for the tool.
* make all of the methods except main private.
* The return statements can be dropped - i.e., just write the return value.
* Several vars can be vals instead.
* removeBrokerPartitionpairs: I don't think you would want to do a partial 
delete under a topic directory. You can check that all the partition offset 
paths are <= since and if so, just delete the topic path. With that the method 
would be better named something like deleteUnconsumedTopicsFromGroup?
* Finally, you are probably aware that there are a bunch of race conditions - 
e.g., checkIfLiveConsumers is a helpful check to have but not guaranteed to be 
correct as some consumers may creep in while the tool is running. However, I 
think it is reasonable for a tool like this to ignore that since a "since" 
value way back would mean the probability of that occuring is very low. Similar 
note for deleteGroupIfNoTopicExists.

                
> Garbage collect old consumer metadata entries
> ---------------------------------------------
>
>                 Key: KAFKA-559
>                 URL: https://issues.apache.org/jira/browse/KAFKA-559
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Jay Kreps
>            Assignee: Tejas Patil
>              Labels: project
>         Attachments: KAFKA-559.v1.patch, KAFKA-559.v2.patch
>
>
> Many use cases involve tranient consumers. These consumers create entries 
> under their consumer group in zk and maintain offsets there as well. There is 
> currently no way to delete these entries. It would be good to have a tool 
> that did something like
>   bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] 
> --zookeeper [zk_connect]
> This would scan through consumer group entries and delete any that had no 
> offset update since the given date.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to