Jorge, thanks for the update.
I would suggest to not reuse `ConsumerGroupCommand` and re implement what we need in `StreamsResetter` directly. Even if we need to keep `StreamsResetter` in `core` for now, I think we should not introduce new dependencies. Currently, we still use old `kafka.admin.AdmitClient` in `StreamsResetter`. We need new `KafkaAdminClient` to support "describe consumer group" to get rid of this part. Than we can move `StreamsResetter` to `streams` package. Cf. https://issues.apache.org/jira/browse/KAFKA-6058 https://issues.apache.org/jira/browse/KAFKA-5965 Feel free to pick up KAFKA-6058 and KAFKA-5965. -Matthias On 10/9/17 12:54 AM, Jorge Esteban Quilcate Otoya wrote: > Matthias, > > Thanks for the heads up! > > I think the main dependency is from `StreamResseter` to > `ConsumerGroupCommand` class to actually reuse `#reset-offsets` > functionality. > > Not sure what would be the better way to remove it. To expose commands > (e.g. `ConsumerGroupCommand`) as part of AdminClient, they have to be > re-implemented on the `client` module right? Is this an option? If not I > think we should keep `StreamResseter` as part of `core` module until we > have `ConsumerGroupCommand` on `client` module as well. > > El vie., 6 oct. 2017 a las 0:05, Matthias J. Sax (<matth...@confluent.io>) > escribió: > >> Jorge, >> >> KIP-198 (that got merged already) overlaps with this KIP. Can you please >> update your KIP accordingly? >> >> Also, while working on KIP-198, we identified some shortcomings in >> AdminClient that do not allow us to move StreamsResetter our of core >> package. We want to address those shortcoming in another KIP to add >> missing functionality to the new AdminClient. >> >> Having say this, and remembering a discussion about dependencies that >> might be introduced by this KIP, it might be good to understand those >> dependencies in detail. Maybe we can resolve those dependencies somehow >> and thus, be able to more StreamsResetter out of core package. Could you >> summarize those dependencies in the KIP or just as a reply? >> >> Thanks! >> >> >> -Matthias >> >> On 9/11/17 3:02 PM, Jorge Esteban Quilcate Otoya wrote: >>> Thanks Guozhang! >>> >>> I have updated the KIP to: >>> >>> 1. Only one scenario param is allowed. If none, `to-earliest` will be >> used, >>> behaving as the current version. >>> >>> 2. >>> 1. An exception will be printed mentioning that there is no existing >>> offsets registered. >>> 2. inputTopics format could support define partition numbers as in >>> reset-offsets option for kafka-consumer-groups. >>> >>> 3. That should be handled by KIP-198. >>> >>> I will start the VOTE thread in a following email. >>> >>> >>> El mié., 30 ago. 2017 a las 2:01, Guozhang Wang (<wangg...@gmail.com>) >>> escribió: >>> >>>> Hi Jorge, >>>> >>>> Thanks for the KIP. It would be a great to add feature to the reset >> tools. >>>> I made a pass over it and it looks good to me overall. I have a few >>>> comments: >>>> >>>> 1. For all the scenarios, do we allow users to specify more than one >>>> parameters? If not could you make that clear in the wiki, e.g. we would >>>> return with an error message saying that only one is allowed; if yes >> then >>>> what precedence order we are following? >>>> >>>> 2. Personally I feel that "--by-duration", "--to-offset" and >> "--shift-by" >>>> are a tad overkill, because 1) they assume there exist some committed >>>> offset for each of the topic, but that may not be always true, 2) >> offset / >>>> time shifting amount on different topics may not be a good fit >> universally, >>>> i.e. one could imagine the we want to reset all input topics to their >>>> offsets of a given time, but resetting all topics' offset to the same >> value >>>> or let all of them shifting the same amount of offsets are usually not >>>> applicable. For "--by-duration" it seems could be easily supported by >> the >>>> "to-date". >>>> >>>> For the general consumer group reset tool, since it could be set one per >>>> partition these parameters may be more useful. >>>> >>>> 3. As for the implementation details, when removing zookeeper config in >>>> `kafka-streams-application-reset`, we should consider return a meaning >>>> error message otherwise it would be "unrecognized config" blah. >>>> >>>> >>>> If you feel confident about the wiki after discussing about these >> points, >>>> please feel free to move on to start a voting thread. Note that we are >>>> about 3 weeks away from KIP deadline and 4 weeks away from feature >>>> deadline. >>>> >>>> >>>> Guozhang >>>> >>>> >>>> >>>> >>>> >>>> On Tue, Aug 22, 2017 at 1:45 PM, Matthias J. Sax <matth...@confluent.io >>> >>>> wrote: >>>> >>>>> Thanks for the update Jorge. >>>>> >>>>> I don't have any further comments. >>>>> >>>>> >>>>> -Matthias >>>>> >>>>> On 8/12/17 6:43 PM, Jorge Esteban Quilcate Otoya wrote: >>>>>> I have updated the KIP: >>>>>> >>>>>> - Change execution parameters, using `--dry-run` >>>>>> - Reference KAFKA-4327 >>>>>> - And advise about changes on `StreamResetter` >>>>>> >>>>>> Also includes that it will cover a change on `ConsumerGroupCommand` to >>>>>> align execution options. >>>>>> >>>>>> El dom., 16 jul. 2017 a las 5:37, Matthias J. Sax (< >>>>> matth...@confluent.io>) >>>>>> escribió: >>>>>> >>>>>>> Thanks a lot for the update! >>>>>>> >>>>>>> I like the KIP! >>>>>>> >>>>>>> One more question about `--dry-run` vs `--execute`: While I agree >> that >>>>>>> we should use the same flag for both tools, I am not sure which one >> is >>>>>>> the better one... My personal take is, that I like `--dry-run` >> better. >>>>>>> Not sure what others think. >>>>>>> >>>>>>> One more comment: with the removal of ZK, we can also tackle this >>>> JIRA: >>>>>>> https://issues.apache.org/jira/browse/KAFKA-4327 If we do so, I >> think >>>>> we >>>>>>> should mention it in the KIP. >>>>>>> >>>>>>> I am also not sure about backward compatibility issue for this case. >>>>>>> Actually, I don't expect people to call `StreamsResetter` from Java >>>>>>> code, but you can never know. So if we break this, we need to make >>>> sure >>>>>>> to cover it in the KIP and later on in the release notes. >>>>>>> >>>>>>> >>>>>>> -Matthias >>>>>>> >>>>>>> On 7/14/17 7:15 AM, Jorge Esteban Quilcate Otoya wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> KIP is updated. >>>>>>>> Changes: >>>>>>>> 1. Approach discussed to keep both tools (streams application >>>> resetter >>>>>>> and >>>>>>>> consumer group reset offset). >>>>>>>> 2. Options has been aligned between both tools. >>>>>>>> 3. Zookeeper option from streams-application-resetted will be >>>> removed, >>>>>>> and >>>>>>>> replaced internally for Kafka AdminClient. >>>>>>>> >>>>>>>> Looking forward to your feedback. >>>>>>>> >>>>>>>> El jue., 29 jun. 2017 a las 15:04, Matthias J. Sax (< >>>>>>> matth...@confluent.io>) >>>>>>>> escribió: >>>>>>>> >>>>>>>>> Damian, >>>>>>>>> >>>>>>>>>> resets everything and clears up >>>>>>>>>>> the state stores. >>>>>>>>> >>>>>>>>> That's not correct. The reset tool does not touch local store. For >>>>> this, >>>>>>>>> we have `KafkaStreams#clenup` -- otherwise, you would need to run >>>> the >>>>>>>>> tool in every machine you run an application instance. >>>>>>>>> >>>>>>>>> With regard to state, the tool only deletes the underlying >> changelog >>>>>>>>> topics. >>>>>>>>> >>>>>>>>> Just to clarify. The idea of allowing to rest with different offset >>>> is >>>>>>>>> to clear all state but just use a different start offset (instead >> of >>>>>>>>> zero). This is for use case where your topic has a larger retention >>>>> time >>>>>>>>> than the amount of data you want to reprocess. But we always need >> to >>>>>>>>> start with an empty state. (Resetting with consistent state is >>>>> something >>>>>>>>> we might do at some point, but it's much hard and not part of this >>>>> KIP) >>>>>>>>> >>>>>>>>>> @matthias, could we remove the ZK dependency from the streams >> reset >>>>>>> tool >>>>>>>>>> now? >>>>>>>>> >>>>>>>>> I think so. The new AdminClient provide the feature we need AFAIK. >> I >>>>>>>>> guess we can picky back this into the KIP (we would need a KIP >>>> anyway >>>>>>>>> because we deprecate `--zookeeper` what is an public API change). >>>>>>>>> >>>>>>>>> >>>>>>>>> I just want to point out, that even without ZK dependency, I prefer >>>> to >>>>>>>>> wrap the consumer offset tool and keep two tools. >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> On 6/29/17 9:14 AM, Damian Guy wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Thanks for the KIP. What is not clear is how is this going to >>>> handle >>>>>>>>> state >>>>>>>>>> stores? Right now the streams reset tool, resets everything and >>>>> clears >>>>>>> up >>>>>>>>>> the state stores. What are we going to do if we reset to a >>>> particular >>>>>>>>>> offset? If we clear the state then we've lost any previously >>>>> aggregated >>>>>>>>>> values (which may or may not be what is expected). If we don't >>>> clear >>>>>>>>> them, >>>>>>>>>> then we will end up with incorrect aggregates. >>>>>>>>>> >>>>>>>>>> @matthias, could we remove the ZK dependency from the streams >> reset >>>>>>> tool >>>>>>>>>> now? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Damian >>>>>>>>>> >>>>>>>>>> On Thu, 29 Jun 2017 at 15:22 Jorge Esteban Quilcate Otoya < >>>>>>>>>> quilcate.jo...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> You're right, I was not considering Zookeeper dependency. >>>>>>>>>>> >>>>>>>>>>> I'm starting to like the idea to wrap `reset-offset` from >>>>>>>>>>> `streams-application-reset`. >>>>>>>>>>> >>>>>>>>>>> I will wait a bit for more feedback and work on a draft with this >>>>>>>>> changes. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> El mié., 28 jun. 2017 a las 0:20, Matthias J. Sax (< >>>>>>>>> matth...@confluent.io >>>>>>>>>>>> ) >>>>>>>>>>> escribió: >>>>>>>>>>> >>>>>>>>>>>> I agree, that we should not duplicate functionality. >>>>>>>>>>>> >>>>>>>>>>>> However, I am worried, that a non-streams users using the offset >>>>>>> reset >>>>>>>>>>>> tool might delete topics unintentionally (even if the changes >> are >>>>>>>>> pretty >>>>>>>>>>>> small). Also, currently the stream reset tool required Zookeeper >>>>>>> while >>>>>>>>>>>> the offset reset tool does not -- I don't think we should add >>>> this >>>>>>>>>>>> dependency to the offset reset tool. >>>>>>>>>>>> >>>>>>>>>>>> Thus, it think it might be better to keep both tools, but >>>>> internally >>>>>>>>>>>> rewrite the streams reset entry class, to reuse as much as >>>> possible >>>>>>>>> from >>>>>>>>>>>> the offset reset tool. Ie. the streams tool would be a wrapper >>>>> around >>>>>>>>>>>> the offset tool and add some functionality it needs that is >>>> Streams >>>>>>>>>>>> specific. >>>>>>>>>>>> >>>>>>>>>>>> I also think, that keeping separate tools for consumers and >>>> streams >>>>>>> is >>>>>>>>> a >>>>>>>>>>>> good thing. We might want to add new features that don't apply >> to >>>>>>> plain >>>>>>>>>>>> consumers -- note, a Streams applications is more than just a >>>>> client. >>>>>>>>>>>> >>>>>>>>>>>> WDYT? >>>>>>>>>>>> >>>>>>>>>>>> Would be good to get some feedback from others, too. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -Matthias >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 6/27/17 9:05 AM, Jorge Esteban Quilcate Otoya wrote: >>>>>>>>>>>>> Thanks for the feedback Matthias! >>>>>>>>>>>>> >>>>>>>>>>>>> The main idea is to use only 1 tool to reset offsets and don't >>>>>>>>>>> replicate >>>>>>>>>>>>> functionality between tools. >>>>>>>>>>>>> Reset Offset (KIP-122) tool not only reset but support execute >>>> the >>>>>>>>>>> reset >>>>>>>>>>>>> but also export, import from files, etc. >>>>>>>>>>>>> If we extend the current tool (kafka-streams-application- >>>>> reset.sh) >>>>>>> we >>>>>>>>>>>> will >>>>>>>>>>>>> have to duplicate all this functionality also. >>>>>>>>>>>>> Maybe another option is to move the current implementation into >>>>>>>>>>>>> `kafka-consumer-group` and add a new command >>>>>>> `--reset-offset-streams` >>>>>>>>>>>> with >>>>>>>>>>>>> the current implementation functionality and add >>>> `--reset-offset` >>>>>>>>>>> options >>>>>>>>>>>>> for input topics. Does this make sense? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> El lun., 26 jun. 2017 a las 23:32, Matthias J. Sax (< >>>>>>>>>>>> matth...@confluent.io>) >>>>>>>>>>>>> escribió: >>>>>>>>>>>>> >>>>>>>>>>>>>> Jorge, >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks a lot for this KIP. Allowing the reset streams >>>>> applications >>>>>>>>>>> with >>>>>>>>>>>>>> arbitrary start offset is something we got multiple requests >>>>>>> already. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Couple of clarification question: >>>>>>>>>>>>>> >>>>>>>>>>>>>> - why do you want to deprecate the current tool instead of >>>>>>> extending >>>>>>>>>>>>>> the current tool with the stuff the offset reset tool can do >>>> (ie, >>>>>>> use >>>>>>>>>>>>>> the offset reset tool internally) >>>>>>>>>>>>>> >>>>>>>>>>>>>> - you suggest to extend the offset reset tool to replace the >>>>>>> stream >>>>>>>>>>>>>> reset tool: how would the reset tool know if it is resetting a >>>>>>>>> streams >>>>>>>>>>>>>> applications or a regular consumer group? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 6/26/17 1:28 PM, Jorge Esteban Quilcate Otoya wrote: >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'd like to start the discussion to add reset offset tooling >>>> for >>>>>>>>>>> Stream >>>>>>>>>>>>>>> applications. >>>>>>>>>>>>>>> The KIP can be found here: >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>> 171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Jorge. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> -- Guozhang >>>> >>> >> >> >
signature.asc
Description: OpenPGP digital signature