Hi, thanks for updating the KIP. Couple of follow up comments:
* Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by time" option -- IMHO it belongs to "reset by position"? * Nit: Description of "Reset to Earliest" > using Kafka Consumer's `auto.offset.reset` to `earliest` I think this is strictly speaking not correct (as auto.offset.reset only triggered if no valid offset is found, but this tool explicitly modified committed offset), and should be phrased as > using Kafka Consumer's #seekToBeginning() -> similar issue for description of "Reset to Latest" * Main option: rename to --reset-offsets (plural instead of singular) * Scenario Options: I would remove "reset" from all options, because the main argument "--reset-offset" says already what to do: > bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX better (IMHO): > bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX * Option 1.e ("print and export current offset") is not intuitive to use IMHO. The main option is "--reset-offset" but nothing happens if no scenario is specified. It is also not specified, what the output should look like? Furthermore, --describe should actually show currently committed offset for a group. So it seems to be redundant to have the same option in --reset-offsets * Option 2.a: I would rename to "--reset-to-offset" (or considering the comment above to "--to-offset") * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar) and accept positive/negative values * About Scope "all": maybe it's better to have an option "--all-topics" (or similar). IMHO explicit arguments are preferable over implicit setting to guard again accidental miss use of the tool. * Scope: I also think, that "--topic" (singular) and "--topics" (plural) are too similar and easy to use in a wrong way (ie, mix up) -- maybe we can have two options that are easier to distinguish. * I still think that JSON is not the best format (it's too verbose/hard to write for humans from scratch). A simple CSV format with implicit schema (topic,partition,offset) would be sufficient. * Why does the JSON contain "group_id" field -- there is parameter "--group" to specify the group ID. Would one overwrite the other (what order) or would there be an error if "--group" is used in combination with "--reset-from-file"? -Matthias On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote: > Hi, > > according to the feedback, I've updated the KIP: > > - We have added and ordered the scenarios, scopes and executions of the > Reset Offset tool. > - Consider it as an extension to the current `ConsumerGroupCommand` tool > - Execution will be possible without generating JSON files. > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling > > Looking forward to your feedback! > > Jorge. > > El mié., 8 feb. 2017 a las 23:23, Jorge Esteban Quilcate Otoya (< > quilcate.jo...@gmail.com>) escribió: > >> Great. I think I got the idea. What about this options: >> >> Scenarios: >> >> 1. Current status >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1´ >> >> 2. To Datetime >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime >> 2017-01-01T00:00:00.000´ >> >> 3. To Period >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´ >> >> 4. To Earliest >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´ >> >> 5. To Latest >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´ >> >> 6. Minus 'n' offsets >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´ >> >> 7. Plus 'n' offsets >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´ >> >> 8. To specific offset >> >> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´ >> >> Scopes: >> >> a. All topics used by Consumer Group >> >> Don't specify --topics >> >> b. Specific List of Topics >> >> Add list of values in --topics t1,t2,tn >> >> c. One Topic, all Partitions >> >> Add one topic and no partitions values: --topic t1 >> >> d. One Topic, List of Partitions >> >> Add one topic and partitions values: --topic t1 --partitions 0,1,2 >> >> About Reset Plan (JSON file): >> >> I think is still valid to have the option to persist reset configuration >> as a file, but I agree to give the option to run the tool without going >> down to the JSON file. >> >> Execution options: >> >> 1. Without execution argument (No args): >> >> Print out results (reset plan) >> >> 2. With --execute argument: >> >> Run reset process >> >> 3. With --output argument: >> >> Save result in a JSON format. >> >> 4. Only with --execute option and --reset-file (path to JSON) >> >> Reset based on file >> >> 4. Only with --verify option and --reset-file (path to JSON) >> >> Verify file values with current offsets >> >> I think we can remove --generate-and-execute because is a bit clumsy. >> >> With this options we will be able to execute with manual JSON >> configuration. >> >> >> El mié., 8 feb. 2017 a las 22:43, Ben Stopford (<b...@confluent.io>) >> escribió: >> >> Yes - using a tool like this to skip a set of consumer groups over a >> corrupt/bad message is definitely appealing. >> >> B >> >> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <g...@confluent.io> wrote: >> >>> I like the --reset-to-earliest and --reset-to-latest. In general, >>> since the JSON route is the most challenging for users, we want to >>> provide a lot of ways to do useful things without going there. >>> >>> Two things that can help: >>> >>> 1. A lot of times, users want to skip few messages that cause issues >>> and continue. maybe just specifying the topic, partition and delta >>> will be better than having to find the offset and write a JSON and >>> validate the JSON etc. >>> >>> 2. Thinking if there are other common use-cases that we can make easy >>> rather than just one generic but not very usable method. >>> >>> Gwen >>> >>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya >>> <quilcate.jo...@gmail.com> wrote: >>>> Thanks for the feedback! >>>> >>>> @Onur, @Gwen: >>>> >>>> Agree. Actually at the first draft I considered to have it inside >>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone >>> tool >>>> to describe it clearly and focus it on reset functionality. >>>> >>>> But now that you mentioned, it does make sense to have it in >>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce >>> it? >>>> >>>> Maybe something like this: >>>> >>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1 >> --topics >>> t1 >>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´ >>>> >>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file >>>> plan.json´ >>>> >>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file >>>> plan.json´ >>>> >>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group >>> cg1 >>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´ >>>> >>>> @Gwen: >>>> >>>>> It looks exactly like the replica assignment tool >>>> >>>> It was influenced by ;-) I use the generate-verify-execute process here >>> to >>>> make sure user will be aware of the result of this operation. At the >>>> beginning we considered only add a couple of options to Consumer Group >>>> Command: >>>> >>>> --rewind-to-timestamp and --rewind-to-period >>>> >>>> @Onur: >>>> >>>>> You can actually get away with overriding while members of the group >>> are live >>>> with method 2 by using group information from DescribeGroupsRequest. >>>> >>>> This means that we need to have Consumer Group stopped before executing >>> and >>>> start a new consumer internally to do this? Therefore, we won't be able >>> to >>>> consider executing reset when ConsumerGroup is active? (trying to >> relate >>> it >>>> with @Dong 5th question) >>>> >>>> @Dong: >>>> >>>>> Should we allow user to use wildcard to reset offset of all groups >> for a >>>> given topic as well? >>>> >>>> I haven't thought about this scenario. Could be interesting. Following >>> the >>>> recommendation to add it into Consumer Group Command, in this case >> Group >>>> argument will be optional if there are only 1 topic. I think for >> multiple >>>> topic won't be that useful. >>>> >>>>> Should we allow user to specify timestamp per topic partition in the >>> json >>>> file as well? >>>> >>>> Don't think this could be a valid from the tool, but if Reset Plan is >>>> generated, and user want to set the offset for a specific partition to >>>> other offset (eventually based on another timestamp), and execute it, >> it >>>> will be up to her/him. >>>> >>>>> Should the script take some credential file to make sure that this >>>> operation is authenticated given the potential impact of this >> operation? >>>> >>>> Haven't tried to secure brokers yet, but the tool should support >>>> authorization if it's enabled in the broker. >>>> >>>>> Should we provide constant to reset committed offset to >> earliest/latest >>>> offset of a partition, e.g. -1 indicates earliest offset and -2 >> indicates >>>> latest offset. >>>> >>>> I will go for something like ´--reset-to-earliest´ and >>> ´--reset-to-latest´ >>>> >>>>> Should we allow dynamic change of the comitted offset when consumer >> are >>>> running, such that consumer will seek to the newly committed offset and >>>> start consuming from there? >>>> >>>> Not sure about this. I will recommend to keep it simple and ask user to >>>> stop consumers first. But I would considered it if the trade-offs are >>>> clear. >>>> >>>> @Matthias >>>> >>>> Added :). And thanks a lot for your help to define this KIP! >>>> >>>> >>>> >>>> El mié., 8 feb. 2017 a las 7:47, Gwen Shapira (<g...@confluent.io>) >>>> escribió: >>>> >>>>> As long as the CLI is a bit consistent? Like, not just adding 3 >>>>> arguments and a JSON parser to the existing tool, right? >>>>> >>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman >>>>> <onurkaraman.apa...@gmail.com> wrote: >>>>>> I think it makes sense to just add the feature to >>>>> kafka-consumer-groups.sh >>>>>> >>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <g...@confluent.io> >>> wrote: >>>>>> >>>>>>> Thanks for the KIP. I'm super happy about adding the capability. >>>>>>> >>>>>>> I hate the interface, though. It looks exactly like the replica >>>>>>> assignment tool. A tool everyone loves so much that there are >>> multiple >>>>>>> projects, open and closed, that try to fix it. >>>>>>> >>>>>>> Can we swap it with something that looks a bit more like the >> consumer >>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful >>> in >>>>>>> such cases. I spent some time learning existing tools and learning >>> yet >>>>>>> another one is a deterrent. >>>>>>> >>>>>>> Gwen >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya >>>>>>> <quilcate.jo...@gmail.com> wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer >> Group >>>>>>> Offsets. >>>>>>>> >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets >>>>>>>> >>>>>>>> Please, take a look at the proposal and share your feedback. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jorge. >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Gwen Shapira >>>>>>> Product Manager | Confluent >>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760> >> <(650)%20450-2760> | @gwenshap >>>>>>> Follow us: Twitter | blog >>>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Gwen Shapira >>>>> Product Manager | Confluent >>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760> >> <(650)%20450-2760> | @gwenshap >>>>> Follow us: Twitter | blog >>>>> >>> >>> >>> >>> -- >>> Gwen Shapira >>> Product Manager | Confluent >>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760> | @gwenshap >>> Follow us: Twitter | blog >>> >> >> >
signature.asc
Description: OpenPGP digital signature