Chris,

Thanks for the comments!

1. I've updated the script to sync-manifests [--keep-not-found] with your
described semantics.

2. I've pushed all of the deprecation decisions to a follow-up KIP that can
take place after an intervening release.
I hope that this feature has a high enough ROI that the direction of the
migration is obvious to users even without a formal deprecation notice.
I do think that without a formal deprecation notice that the migration
scripts may be in use for a long time, particularly for connector
implementations which are not receiving active maintenance.

3. Thanks for looking into the ServiceLoader error handling contract.
>From my inspection of the ServiceLoader implementation, it seems to be able
to recover from most of the poor packaging scenarios that I'm aware of.
I think with or without the deprecation, operators are free to select the
ONLY_SCAN mode to work around any errors we don't discover before release.

Thanks,
Greg

On Mon, Feb 6, 2023 at 8:44 AM Chris Egerton <chr...@aiven.io.invalid>
wrote:

> Hi Greg,
>
> Thanks for the updates. Unless stated below, I agree with your responses. A
> few more thoughts:
>
> 1. IMO it's not worth it to have separate commands for adding/removing
> manifests, mostly because it adds complexity to the tool and might make it
> harder for users to understand. I think a single "update-manifests" or
> "sync-manifests" command that both adds manifests for found plugins and
> removes manifests for unfound plugins by default, with a --keep-unfound
> flag to disable removal on an opt-in basis, would make more sense.
>
> 2. I agree with this point you raise about deprecation: "if we agree that
> the old mechanism has some sunset date that we just don't know yet, we
> should still communicate to users that the sunset date is somewhere in the
> future." However, I'm not certain that releasing the features proposed in
> this KIP alone gives us enough confidence in them to sunset the legacy
> plugin loading logic, which is why I was hoping we could have at least one
> successful release (preferably even two or three) with this feature in
> order to identify potential gaps in the design before proceeding. This is
> similar to the strategy we took with incremental rebalancing in Connect,
> which introduced the "connect.protocol" property with the proposal that it
> could be marked deprecated in the next major release. That strategy served
> us well, since there were a few kinks in the logic for that feature that
> needed to be worked out, and many users continued to rely on the legacy
> rebalancing logic as a workaround for those issues.
>
> 3. I was wondering about error handling with the service loading mechanism
> because the Javadocs for the Iterator returned by ServiceLoader::iterator
> [1] state that "If an error is thrown then subsequent invocations of the
> iterator will make a best effort to locate and instantiate the next
> available provider, but in general such recovery cannot be guaranteed."
> Upon further reflection, I agree that it's fine for now to leave details on
> error handling out of the public contract for this new loading mode, but
> only if we also agree to hold off on deprecating the old loading mode,
> since there may be cases where worker startup is crippled by an
> improperly-packaged plugin when using the service loader mechanism for its
> plugins.
>
> [1] -
>
> https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/ServiceLoader.html#iterator()
>
> Cheers,
>
> Chris
>
> On Tue, Jan 24, 2023 at 10:26 AM Federico Valeri <fedeval...@gmail.com>
> wrote:
>
> > Hi Greg,
> >
> > On Tue, Jan 24, 2023 at 2:48 AM Greg Harris
> > <greg.har...@aiven.io.invalid> wrote:
> > >
> > > Federico,
> > >
> > > Thanks for taking a look!
> > >
> > > > I was just wondering if we should use a better script name like
> > > > "connect-convert-to-service-provider.sh" or something like this
> > >
> > > I agree that the current name is not ideal, while the current name
> > > describes part of what it is doing, it does not describe all of what it
> > is
> > > doing, or why that is important to the caller.
> > > I looked around to see what style the other Kafka commands use, and it
> > > seems like there's not a standard pattern. Some of the scripts are
> > > noun-phrases, and some are verb-phrases.
> > > Some take commands like `create`, `format`, etc, and some take options
> > like
> > > `--generate`, `--execute`, `--verify`.
> > > I've updated the command to `bin/connect-plugin-path.sh
> > > (list|add-manifests|remove-manifests) --worker-config <worker-config>`,
> > let
> > > me know if that's better or worse than
> `bin/connect-scan-plugin-path.sh`.
> > >
> > > > maybe add a --dry-run option.
> > >
> > > Thanks for the suggestion, I've updated the KIP to include a --dry-run
> > > option with some typical semantics.
> >
> > This is much better. Thanks!
> >
> > I think it's better to not deprecate the add-manifests and
> > remove-manifests script sub-commands. When we will remove the
> > deprecated plugin discovery modes, one may still have the need to
> > convert an old connector release, maybe because it's the only version
> > compatible with a third-party external system.
> >
> > >
> > > Chris,
> > >
> > > I'm glad you liked the personas! Since this KIP requires others' help
> to
> > > improve the ecosystem, I needed a way to make sure everyone knows what
> > role
> > > they have to play. I hope I've accomplished that.
> > >
> > > > 1. Will workers' resiliency to packaging issues be affected when the
> > new
> > > > scanning behavior is enabled? For example, will a single
> > poorly-packaged
> > > > plugin prevent other correctly-packaged plugins from being loaded, or
> > > > worse, cause the worker to fail? And either way, are there any
> details
> > > that
> > > > we can call out to back up any guarantees we want to provide on that
> > > front?
> > >
> > > The current behavior when encountering packaging issues seems to be to
> > log
> > > errors and continue, and that will certainly continue for the ONLY_SCAN
> > and
> > > HYBID_WARN modes.
> > > As written, the HYBRID_FAIL mode breaks from this trend and will throw
> > > exceptions that crash the worker only when the connectors are missing
> > from
> > > serviceloader discovery.
> > > The behavior for SERVICE_LOAD is not currently defined in the KIP. I
> > think
> > > we can either keep the log-and-continue behavior whenever possible, or
> > > begin to surface errors in a fail-fast model.
> > > And if we can't choose between those two, perhaps we can add
> > > SERVICE_LOAD_FAIL_FAST, or a plugin.path.discovery.fail.fast=true/false
> > to
> > > allow users to configure the worker to fail to start up when
> plugin.path
> > > issues are present.
> > >
> > > With regards to incorrectly packaged connectors affecting other
> correctly
> > > packaged ones: I think one of the current behaviors of the Reflections
> > > implementation is that certain exceptions can prevent discovery of
> other
> > > unrelated plugins on the path.
> > > I think that this may be able to be improved in all modes, and doesn't
> > need
> > > the ServiceLoader mechanism to resolve. I'll explore this some more and
> > > pursue a fix outside of this KIP.
> > > The new mechanism should be as good or better than the Reflections
> > > mechanism in this regard. I am not sure if it is necessary to commit to
> > the
> > > error handling behavior as part of the public API.
> > >
> > > > 2. I think the description of the "--plugin-location" flag may need
> > some
> > > > updating? In the middle of the list of bullet points there's "The
> > value of
> > > > this argument will be a single plugin, which can be any of the
> > following:"
> > > > but the following points don't appear to be related.
> > >
> > > Thanks, I think that was a copy-paste error. I have updated the KIP.
> > >
> > > > 3. (Nit) Could we rename the migration script something like
> > > > "connect-migrate-plugin-path" or even just "connect-plugin-path"
> with a
> > > > "migrate" subcommand?
> > >
> > > Yes, you and Federico had the same idea :)
> > > Please see my feedback above regarding naming.
> > >
> > > > 4. It's noted that "If a plugin class is removed from the path, the
> > > > corresponding shim manifest should also be removed". Did you consider
> > > > adding behavior (possibly guarded behind a CLI flag) to find and
> remove
> > > > manifests for nonexistent plugins?
> > >
> > > When writing that originally, I thought that the script would only
> remove
> > > manifests that it had generated, as filtered by some in-file comment or
> > by
> > > jar filename, which would be an implementation detail.
> > > If we were to add a CLI flag to also enable/disable the removal
> behavior,
> > > we could eliminate that implementation detail and make the removal an
> > > all-or-nothing part of the script.
> > >
> > > I updated the script to accept a `remove-manifests` subcommand which I
> > > think captures your suggestion. Therefore, a full sync would include
> two
> > > invocations: an `add-manifests` and `remove-manifests`.
> > >
> > > > 5. What's the concrete plan for updating the plugins that ship OOTB
> > with
> > > > Connect? Will they be given a service loader manifest, a module info
> > file,
> > > > or both?
> > >
> > > At a minimum, we need to add service loader manifests in this KIP to be
> > > compliant with the new discovery model.
> > > This is because Kafka 3.x supports Java 8, and that version of the
> > > ServiceLoader can only read manifest files, and not module-info files.
> > > Beyond that, I think it becomes a question of scope, and whether we
> > should
> > > add module-info files for all of the Connect components.
> > > I was not planning on doing so within this KIP, and would prefer to
> leave
> > > that for a follow-up KIP.
> > >
> > > I have clarified this in the KIP.
> > >
> > > > 6. In the HYBRID_WARN mode, what precedence will we give to plugins
> > that
> > > > are found by both the legacy scanning logic and the new service
> > > > loader/module info loading logic?
> > >
> > > In HYBRID_WARN, the control-flow should be the same as SCAN_ONLY, so
> the
> > > same classes are discovered in the same order.
> > > All of the classes that are found via the ServiceLoader mechanism
> should
> > > only be used by the runtime to generate warnings.
> > > I believe that by the nature of the Reflections library, all instances
> > > which are discoverable by the ServiceLoader should be discoverable by
> > > Reflections, and so we don't need to merge the two discovery methods or
> > > handle precedence.
> > >
> > > > 7. It's a bit strange to immediately deprecate config property values
> > and
> > > > the migration script in their first release. IMO as long as we have
> the
> > > > migration script in the project, it doesn't make sense to deprecate
> it,
> > > and
> > > > until we've had at least one successful release with the new loading
> > logic
> > > > and some confidence in it, we should not deprecate the older logic. I
> > > think
> > > > a more reasonable pace for the update here could be to deprecate the
> > > legacy
> > > > scanning logic in 4.0 (changing the default accordingly) and remove
> it
> > > > entirely in 5.0.
> > >
> > > I understand that it seems a bit silly to introduce something as
> > > deprecated, but I think that's because 'deprecated' is not the same as
> > > 'old', and not mutually exclusive with 'new'.
> > > The way I understand it, marking something as deprecated and the
> process
> > of
> > > deprecation is about communicating the direction in which an API is
> > > evolving, and which methods can be relied upon for the foreseeable
> > future.
> > > And if we agree that the old mechanism has some sunset date that we
> just
> > > don't know yet, we should still communicate to users that the sunset
> date
> > > is somewhere in the future.
> > >
> > > I've updated the KIP to describe separate follow-on votes for changing
> > the
> > > default, and removing the scanning behavior, as that was my original
> > intent
> > > but I don't think that was made clear.
> > > I also pushed back the deprecation of the migration sub-commands, since
> > > we'd like to usher people towards the migration script instead of
> relying
> > > on the HYBRID modes.
> > > I kept the decision for deprecating the non-SERVICE_LOAD modes in the
> > > current vote for now, but if introducing deprecated configurations is
> too
> > > unpalatable we can move the deprecation to a later vote.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Mon, Jan 23, 2023 at 12:49 PM Chris Egerton <chr...@aiven.io.invalid
> >
> > > wrote:
> > >
> > > > Hi Greg,
> > > >
> > > > Thanks for the KIP! This part of Connect has been due for an update
> > for a
> > > > while, it's nice to see it getting some attention. I especially like
> > how
> > > > the migration plan is broken down into affected personas; makes
> review
> > > > easier and, hopefully, makes the feature easier to understand for
> > affected
> > > > users.
> > > >
> > > > Here are my initial thoughts:
> > > >
> > > > 1. Will workers' resiliency to packaging issues be affected when the
> > new
> > > > scanning behavior is enabled? For example, will a single
> > poorly-packaged
> > > > plugin prevent other correctly-packaged plugins from being loaded, or
> > > > worse, cause the worker to fail? And either way, are there any
> details
> > that
> > > > we can call out to back up any guarantees we want to provide on that
> > front?
> > > >
> > > > 2. I think the description of the "--plugin-location" flag may need
> > some
> > > > updating? In the middle of the list of bullet points there's "The
> > value of
> > > > this argument will be a single plugin, which can be any of the
> > following:"
> > > > but the following points don't appear to be related.
> > > >
> > > > 3. (Nit) Could we rename the migration script something like
> > > > "connect-migrate-plugin-path" or even just "connect-plugin-path"
> with a
> > > > "migrate" subcommand?
> > > >
> > > > 4. It's noted that "If a plugin class is removed from the path, the
> > > > corresponding shim manifest should also be removed". Did you consider
> > > > adding behavior (possibly guarded behind a CLI flag) to find and
> remove
> > > > manifests for nonexistent plugins?
> > > >
> > > > 5. What's the concrete plan for updating the plugins that ship OOTB
> > with
> > > > Connect? Will they be given a service loader manifest, a module info
> > file,
> > > > or both?
> > > >
> > > > 6. In the HYBRID_WARN mode, what precedence will we give to plugins
> > that
> > > > are found by both the legacy scanning logic and the new service
> > > > loader/module info loading logic?
> > > >
> > > > 7. It's a bit strange to immediately deprecate config property values
> > and
> > > > the migration script in their first release. IMO as long as we have
> the
> > > > migration script in the project, it doesn't make sense to deprecate
> > it, and
> > > > until we've had at least one successful release with the new loading
> > logic
> > > > and some confidence in it, we should not deprecate the older logic. I
> > think
> > > > a more reasonable pace for the update here could be to deprecate the
> > legacy
> > > > scanning logic in 4.0 (changing the default accordingly) and remove
> it
> > > > entirely in 5.0.
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Mon, Jan 23, 2023 at 3:21 AM Federico Valeri <
> fedeval...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Greg, this looks like a useful change to me.
> > > > >
> > > > > I was just wondering if we should use a better script name like
> > > > > "connect-convert-to-service-provider.sh" or something like this,
> and
> > > > > maybe add a --dry-run option.
> > > > >
> > > > > On Tue, Jan 17, 2023 at 8:45 PM Greg Harris
> > > > > <greg.har...@aiven.io.invalid> wrote:
> > > > > >
> > > > > > Hi all!
> > > > > >
> > > > > > I'd like to start a discussion about
> > > > > >
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-898%3A+Modernize+Connect+plugin+discovery
> > > > > > .
> > > > > >
> > > > > > Thanks!
> > > > > > Greg Harris
> > > > >
> > > >
> >
>

Reply via email to