Federico, Thanks for taking a look!
> I was just wondering if we should use a better script name like > "connect-convert-to-service-provider.sh" or something like this I agree that the current name is not ideal, while the current name describes part of what it is doing, it does not describe all of what it is doing, or why that is important to the caller. I looked around to see what style the other Kafka commands use, and it seems like there's not a standard pattern. Some of the scripts are noun-phrases, and some are verb-phrases. Some take commands like `create`, `format`, etc, and some take options like `--generate`, `--execute`, `--verify`. I've updated the command to `bin/connect-plugin-path.sh (list|add-manifests|remove-manifests) --worker-config <worker-config>`, let me know if that's better or worse than `bin/connect-scan-plugin-path.sh`. > maybe add a --dry-run option. Thanks for the suggestion, I've updated the KIP to include a --dry-run option with some typical semantics. Chris, I'm glad you liked the personas! Since this KIP requires others' help to improve the ecosystem, I needed a way to make sure everyone knows what role they have to play. I hope I've accomplished that. > 1. Will workers' resiliency to packaging issues be affected when the new > scanning behavior is enabled? For example, will a single poorly-packaged > plugin prevent other correctly-packaged plugins from being loaded, or > worse, cause the worker to fail? And either way, are there any details that > we can call out to back up any guarantees we want to provide on that front? The current behavior when encountering packaging issues seems to be to log errors and continue, and that will certainly continue for the ONLY_SCAN and HYBID_WARN modes. As written, the HYBRID_FAIL mode breaks from this trend and will throw exceptions that crash the worker only when the connectors are missing from serviceloader discovery. The behavior for SERVICE_LOAD is not currently defined in the KIP. I think we can either keep the log-and-continue behavior whenever possible, or begin to surface errors in a fail-fast model. And if we can't choose between those two, perhaps we can add SERVICE_LOAD_FAIL_FAST, or a plugin.path.discovery.fail.fast=true/false to allow users to configure the worker to fail to start up when plugin.path issues are present. With regards to incorrectly packaged connectors affecting other correctly packaged ones: I think one of the current behaviors of the Reflections implementation is that certain exceptions can prevent discovery of other unrelated plugins on the path. I think that this may be able to be improved in all modes, and doesn't need the ServiceLoader mechanism to resolve. I'll explore this some more and pursue a fix outside of this KIP. The new mechanism should be as good or better than the Reflections mechanism in this regard. I am not sure if it is necessary to commit to the error handling behavior as part of the public API. > 2. I think the description of the "--plugin-location" flag may need some > updating? In the middle of the list of bullet points there's "The value of > this argument will be a single plugin, which can be any of the following:" > but the following points don't appear to be related. Thanks, I think that was a copy-paste error. I have updated the KIP. > 3. (Nit) Could we rename the migration script something like > "connect-migrate-plugin-path" or even just "connect-plugin-path" with a > "migrate" subcommand? Yes, you and Federico had the same idea :) Please see my feedback above regarding naming. > 4. It's noted that "If a plugin class is removed from the path, the > corresponding shim manifest should also be removed". Did you consider > adding behavior (possibly guarded behind a CLI flag) to find and remove > manifests for nonexistent plugins? When writing that originally, I thought that the script would only remove manifests that it had generated, as filtered by some in-file comment or by jar filename, which would be an implementation detail. If we were to add a CLI flag to also enable/disable the removal behavior, we could eliminate that implementation detail and make the removal an all-or-nothing part of the script. I updated the script to accept a `remove-manifests` subcommand which I think captures your suggestion. Therefore, a full sync would include two invocations: an `add-manifests` and `remove-manifests`. > 5. What's the concrete plan for updating the plugins that ship OOTB with > Connect? Will they be given a service loader manifest, a module info file, > or both? At a minimum, we need to add service loader manifests in this KIP to be compliant with the new discovery model. This is because Kafka 3.x supports Java 8, and that version of the ServiceLoader can only read manifest files, and not module-info files. Beyond that, I think it becomes a question of scope, and whether we should add module-info files for all of the Connect components. I was not planning on doing so within this KIP, and would prefer to leave that for a follow-up KIP. I have clarified this in the KIP. > 6. In the HYBRID_WARN mode, what precedence will we give to plugins that > are found by both the legacy scanning logic and the new service > loader/module info loading logic? In HYBRID_WARN, the control-flow should be the same as SCAN_ONLY, so the same classes are discovered in the same order. All of the classes that are found via the ServiceLoader mechanism should only be used by the runtime to generate warnings. I believe that by the nature of the Reflections library, all instances which are discoverable by the ServiceLoader should be discoverable by Reflections, and so we don't need to merge the two discovery methods or handle precedence. > 7. It's a bit strange to immediately deprecate config property values and > the migration script in their first release. IMO as long as we have the > migration script in the project, it doesn't make sense to deprecate it, and > until we've had at least one successful release with the new loading logic > and some confidence in it, we should not deprecate the older logic. I think > a more reasonable pace for the update here could be to deprecate the legacy > scanning logic in 4.0 (changing the default accordingly) and remove it > entirely in 5.0. I understand that it seems a bit silly to introduce something as deprecated, but I think that's because 'deprecated' is not the same as 'old', and not mutually exclusive with 'new'. The way I understand it, marking something as deprecated and the process of deprecation is about communicating the direction in which an API is evolving, and which methods can be relied upon for the foreseeable future. And if we agree that the old mechanism has some sunset date that we just don't know yet, we should still communicate to users that the sunset date is somewhere in the future. I've updated the KIP to describe separate follow-on votes for changing the default, and removing the scanning behavior, as that was my original intent but I don't think that was made clear. I also pushed back the deprecation of the migration sub-commands, since we'd like to usher people towards the migration script instead of relying on the HYBRID modes. I kept the decision for deprecating the non-SERVICE_LOAD modes in the current vote for now, but if introducing deprecated configurations is too unpalatable we can move the deprecation to a later vote. Thanks, Greg On Mon, Jan 23, 2023 at 12:49 PM Chris Egerton <chr...@aiven.io.invalid> wrote: > Hi Greg, > > Thanks for the KIP! This part of Connect has been due for an update for a > while, it's nice to see it getting some attention. I especially like how > the migration plan is broken down into affected personas; makes review > easier and, hopefully, makes the feature easier to understand for affected > users. > > Here are my initial thoughts: > > 1. Will workers' resiliency to packaging issues be affected when the new > scanning behavior is enabled? For example, will a single poorly-packaged > plugin prevent other correctly-packaged plugins from being loaded, or > worse, cause the worker to fail? And either way, are there any details that > we can call out to back up any guarantees we want to provide on that front? > > 2. I think the description of the "--plugin-location" flag may need some > updating? In the middle of the list of bullet points there's "The value of > this argument will be a single plugin, which can be any of the following:" > but the following points don't appear to be related. > > 3. (Nit) Could we rename the migration script something like > "connect-migrate-plugin-path" or even just "connect-plugin-path" with a > "migrate" subcommand? > > 4. It's noted that "If a plugin class is removed from the path, the > corresponding shim manifest should also be removed". Did you consider > adding behavior (possibly guarded behind a CLI flag) to find and remove > manifests for nonexistent plugins? > > 5. What's the concrete plan for updating the plugins that ship OOTB with > Connect? Will they be given a service loader manifest, a module info file, > or both? > > 6. In the HYBRID_WARN mode, what precedence will we give to plugins that > are found by both the legacy scanning logic and the new service > loader/module info loading logic? > > 7. It's a bit strange to immediately deprecate config property values and > the migration script in their first release. IMO as long as we have the > migration script in the project, it doesn't make sense to deprecate it, and > until we've had at least one successful release with the new loading logic > and some confidence in it, we should not deprecate the older logic. I think > a more reasonable pace for the update here could be to deprecate the legacy > scanning logic in 4.0 (changing the default accordingly) and remove it > entirely in 5.0. > > Cheers, > > Chris > > On Mon, Jan 23, 2023 at 3:21 AM Federico Valeri <fedeval...@gmail.com> > wrote: > > > Hi Greg, this looks like a useful change to me. > > > > I was just wondering if we should use a better script name like > > "connect-convert-to-service-provider.sh" or something like this, and > > maybe add a --dry-run option. > > > > On Tue, Jan 17, 2023 at 8:45 PM Greg Harris > > <greg.har...@aiven.io.invalid> wrote: > > > > > > Hi all! > > > > > > I'd like to start a discussion about > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-898%3A+Modernize+Connect+plugin+discovery > > > . > > > > > > Thanks! > > > Greg Harris > > >