A couple of comments:

* I made some minor, non-critical updates to the motivation section to add
a bit more color/background/clarity. In particular, clarifying how things
are connected to consumer groups for sinks. Still, the motivation isn't
entirely clear about how this works -- it is definitely a completely
offline tool. Specifying that users stop a connector entirely, read/modify
offsets, then restart, would probably be useful info for users of the
feature.
* re: naming, can we avoid including 'source' in the command name? even if
that's all it supports today, I don't think we want to restrict it. While
we only *require* this for source offsets, I think for users it will,
long-term, be way more natural to consider connect offsets generally and
not need to know how to drop down to consumer offsets for sink connectors.
In fact, in an ideal world, many users/operators may not even *understand*
consumer offsets deeply while still being generally familiar with connect
offsets. We can always include an error message for now if they try to do
something with a sink connector (which we presumably need to detect anyway
to correctly handle source connectors).
* re: naming of bin/ script, this is largely an artifact of my prototype
and i don't think an explicit decision (my bad), but currently the connect
commands don't have a kafka- prefix. I'm fine going either way, but if we
standardize on kafka-connect-*, I would like to see a simple follow up to
add the corresponding kafka-connect-standalone[.sh] and
kafka-connect-distributed[.sh] commands to get things consistent, ideally
with corresponding deprecations and plans for removal.
* Does export print all the offset commits currently available in the
topic, or only the most recent? I am curious since this tool could
potentially be useful for rewinding offsets if things get into a bad state,
in which case printing all offsets still available could be useful. I think
this should be a pretty safe extension since if you pipe all available
offsets -> write offsets, you'll still just get the latest offset commit
once the connector is restarted.
* This should probably specify schemaless data. I know there were issues
wrt internal converters that basically enforced this, but we should just be
explicit about it moving forward.
* nit, but i'd just change --connector-names -> --connectors. i don't think
we use -names anywhere else yet and the natural plural seems... more
natural to me
* While we've talked about it for awhile in various KIP and pre-KIP
discussions, standardizing on JSON input/output is new. I'm fine with it,
but I do wonder if we should include a format flag by default or just add
it as needed. All the "old" commands will need an explicit format flag if
we want them to support JSON.
* can we get rid of --offsets-file? isn't this the same as piping stdout to
a file? does it benefit us some other way i am missing? I was actually
confused at first b/c I thought --offsets-file was referring to the
standalone mode file to *read from* not the output JSON file to *write to*.
* export output: definitely a nit, but do we need to repeat the offset in
all the entries or should the top-level just be a map of connector -> list
of (partition, offset)? overhead here is probably minimal so I don't feel
strongly, but I don't see a good reason not to use a more efficient
encoding.
* > and output only those partition-offset pairs for a named connector:
 I am unclear whether the output file is compatible as a normal connect
offsets file, is only for this tool, or something else
* mentioned in the first item, but since deleting offsets w/ a null value
is mentioned at the end of the KIP, I'll reiterate that I think it is
important to call out when this applies -- if done on a live connector,
only a rebalance would guarantee it applies, and in the meantime something
could overwrite. only safe way to apply any of this, including offset
delete, is to stop the connector, apply operation, then continue.

-Ewen


On Sun, Sep 10, 2017 at 3:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq. How about calling it kafka-connect-source-offset-tool.sh
>
> This is better.
> Going over existing .sh files, some have verb in their names while some
> don't.
>
> +1 from me.
>
> On Sun, Sep 10, 2017 at 2:53 PM, Randall Hauch <rha...@gmail.com> wrote:
>
> > Thanks for the comments! Specific comments inline below.
> >
> > Regards,
> >
> > Randall
> >
> > On Sun, Sep 10, 2017 at 2:42 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > bq. connector restart and the next message
> > >
> > > The last part of the sentence seems to be incomplete.
> > >
> >
> > Fixed.
> >
> >
> > > bq. command line tool called kafka-connect-source-offset-reset.sh
> > >
> > > From the description, the tool does more than resetting (e.g.
> deleting).
> > > How about calling it kafka-connect-source-offset-tool.sh
> > >
> >
> > None of the other tools or scripts in the Kafka distribution have "tool"
> in
> > the name, so how about "kafka-connect-source-offsets.sh"?
> >
> >
> > >
> > > bq. bin/kafka-connect-source-offset-reset.sh
> > > --config=my-worker-config.properties --export
> > >
> > > From the table above, export mode is mentioned as required. However,
> > either
> > > --export or --import is required.
> > > Better note this in the KIP.
> > >
> >
> > Addressed in a few places. Hopefully it is more clear now.
> >
> >
> > >
> > > bq. but will remove the offsets for the partition with file "b"
> > >
> > > Please move the sample JSON below the above description.
> > >
> >
> > Done.
> >
> >
> > >
> > > Cheers
> > >
> > > On Sun, Sep 10, 2017 at 11:12 AM, Randall Hauch <rha...@gmail.com>
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > KIP-199 [1] describes a new tool that will allow Connect operators to
> > > read,
> > > > update, and remove offsets stored by Connect runtime. This capability
> > has
> > > > been often asked for by Connect users. The proposal is simple but
> > > flexible.
> > > > Please review and add feedback.
> > > >
> > > > Best regards,
> > > >
> > > > Randall
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 199%3A+Add+Kafka+Connect+offset+reset+tool
> > > >
> > >
> >
>

Reply via email to