I think this is a great area to think about. What about the simpler informal scheme of just having cc ignore config it doesn't know about and instructing connector developers to maintain compatibility when renaming params (i.e. support both old and new) as we do for Kafka config.
The steps would then be: 1. CC connector developer adds new config support, possibly deprecating older configs but not removing them. 2. CC user then upgrades their config, then the connector implementation 3. CC user can always roll back if this doesn't work in which case the new configs added are ignored If you want to totally refactor your configs in a way where compatibility is just not doable then you should rename the connector. This is one where a system test that tests last release versus this release would probably do a lot to help ensure this stuff actually worked as expected. Not sure if this is better or worse than the hard versioning scheme. -Jay On Fri, Aug 14, 2015 at 12:52 AM, Ewen Cheslack-Postava <e...@confluent.io> wrote: > Yes, I think that makes sense. As I see it, the tradeoffs are: > > 1. Complexity - adding these APIs increases the number of things connector > developers need to implement just to get started. In a lot of cases, the > first version of a connector might literally only have some connection > string as a setting and that setting will never be removed/changed. So it > would be, in some sense, unnecessary added complexity. > 2. Relies on correct modification of version number by connector developer > - of course any compatibility relies on the connector developer noting > incompatibilities and handling them properly, but baking versioning into > the APIs suggests we're guaranteeing compatibility. However, we know from > experience that this is a lot harder than just adding version numbers. > Seemingly small changes may change the semantics but not the format of the > config such that there is an unexpected incompatibility. > 3. Force developers to think about compatibility up front - I think this is > the main argument for this approach. By including those APIs, it's a big > red flag to connector developers that they need to think about different > versions of their config and how they will handle any changes. > > I'm ok with including the versioning or not. I don't feel too strongly > because I think including the versions is really a band-aid, not a real > solution, to the compatibility problem. (I don't think there is a solution > for it; it's just a really hard problem...). I am wary of adding more > methods to the APIs because keeping the connector APIs as simple as > possible is really important for getting non-Kafka devs to develop > connectors. > > By the way, this probably extends to the Tasks as well -- they have the > same basic configuration compatibility problems as Connectors do. So I > think adopting this approach implies that we'd do the same to the Task > interfaces. > > > On Thu, Aug 13, 2015 at 10:16 PM, Gwen Shapira <g...@confluent.io> wrote: > > > Hi Team Kafka, > > > > This may be a slightly premature discussion, but forgetting about > upgrades > > is a newbie mistake I'd like us to avoid :) > > > > So, we have connector-plugins, and users use them to create > > connector-instances. This requires some configuration - database > connection > > string, HDFS namenode, etc. > > > > Now suppose we have a SystemX connector-plugin, and it takes just > > "connection string" as argument. And people happily use it. Now imagine > > that the guy who wrong the plugin wants to release SystemX-plugin-2.0. > > > > Ideally, we'd want users to be able to drop 2.0 jar instead of 2.0 and > > restart their connector-instances and keep on running with the existing > > configuration. > > > > But what if SystemX-plugin-2.0 made changes to the configuration? What if > > it now has a new mandatory parameter? Or if the connection string format > > changed? > > > > I'd like to give connector developers a way to upgrade existing > > configuration when they release a new version. > > > > My proposal: > > 1. Connector API now includes 2 new methods - int getVersion() and > > configuration upgrade(configuration) > > 2. When the framework persists configuration for the connector (I'm > talking > > mostly about cluster mode where we want to keep the configuration in > > Kafka), it also persists the version #. > > 3. When starting a connector-instance, if the version from getVersion() > > doesn't match the version in the persisted configs, the framework will > call > > upgrade() with existing configs so the connector can upgrade them and > > return the new configs which will then be persisted with the new version. > > 4. If the upgrade fails, the connector-instance will not run. > > > > Does that make sense? > > > > Gwen > > > > > > -- > Thanks, > Ewen >