Totally agree with the priorities, and I'm actually not even advocating the alternative I through out. I think, ultimately, the connector developer is the one who has to support compatibility, though, irrespective of the mechanism. So I guess it is really just a question of whether it is easier for people to work with explicit versioning or the kind of loosy goosy mechanism I described. I don't know the answer to that question. Internally Kafka and related things have always used explicit versioning and I kind of prefer that too, but I have observed that people struggle with it.
-Jay On Mon, Aug 17, 2015 at 9:21 AM, Gwen Shapira <g...@confluent.io> wrote: > I see our "make life easy" priorities as follows: users > connector > developers > core copycat developers. > In this case I think that best user experience will be consistent behavior > for all copycat connectors, so they'll know what to expect during upgrades. > > I'm not sure how much we can enforce from the framework side, but I suspect > we can do better than "advise connector developers to ignore unknown > configs"... even if that means making the framework or even connector > development slightly more difficult. > > > On Mon, Aug 17, 2015 at 9:08 AM, Jay Kreps <j...@confluent.io> wrote: > > > Yeah I meant this purely as an approach for the connectors themselves to > > implement not that we would automatically suppress configs in the > > framework. > > > > -Jay > > > > On Fri, Aug 14, 2015 at 7:48 PM, Gwen Shapira <g...@confluent.io> wrote: > > > > > The "ignore new configs" plan is good, IMO. I just don't know if its > > > feasible in current Copycat: > > > I'm not sure CC can ignore configuration on the connector behalf. Also, > > > this is something we will want to log very clearly and I'm not sure > this > > is > > > doable outside the connector. > > > > > > Regarding complex implementation for connector developers. All > version1.0 > > > connectors will have the following: > > > > > > int getVersion() { > > > return 1; > > > } > > > > > > Configuration upgradeConfig(Configuration config) { > > > return config; > > > } > > > > > > Not that bad :) We can even have a base class to avoid boilerplate. > > > > > > The complexity shows up when upgradeConfig has to actually upgrade > > configs, > > > but thats our whole point, IMO. > > > > > > Gwen > > > > > > On Fri, Aug 14, 2015 at 10:48 AM, Jay Kreps <j...@confluent.io> wrote: > > > > > > > I think this is a great area to think about. > > > > > > > > What about the simpler informal scheme of just having cc ignore > config > > it > > > > doesn't know about and instructing connector developers to maintain > > > > compatibility when renaming params (i.e. support both old and new) as > > we > > > do > > > > for Kafka config. > > > > > > > > The steps would then be: > > > > 1. CC connector developer adds new config support, possibly > deprecating > > > > older configs but not removing them. > > > > 2. CC user then upgrades their config, then the connector > > implementation > > > > 3. CC user can always roll back if this doesn't work in which case > the > > > new > > > > configs added are ignored > > > > > > > > If you want to totally refactor your configs in a way where > > compatibility > > > > is just not doable then you should rename the connector. > > > > > > > > This is one where a system test that tests last release versus this > > > release > > > > would probably do a lot to help ensure this stuff actually worked as > > > > expected. > > > > > > > > Not sure if this is better or worse than the hard versioning scheme. > > > > > > > > -Jay > > > > > > > > On Fri, Aug 14, 2015 at 12:52 AM, Ewen Cheslack-Postava < > > > e...@confluent.io > > > > > > > > > wrote: > > > > > > > > > Yes, I think that makes sense. As I see it, the tradeoffs are: > > > > > > > > > > 1. Complexity - adding these APIs increases the number of things > > > > connector > > > > > developers need to implement just to get started. In a lot of > cases, > > > the > > > > > first version of a connector might literally only have some > > connection > > > > > string as a setting and that setting will never be removed/changed. > > So > > > it > > > > > would be, in some sense, unnecessary added complexity. > > > > > 2. Relies on correct modification of version number by connector > > > > developer > > > > > - of course any compatibility relies on the connector developer > > noting > > > > > incompatibilities and handling them properly, but baking versioning > > > into > > > > > the APIs suggests we're guaranteeing compatibility. However, we > know > > > from > > > > > experience that this is a lot harder than just adding version > > numbers. > > > > > Seemingly small changes may change the semantics but not the format > > of > > > > the > > > > > config such that there is an unexpected incompatibility. > > > > > 3. Force developers to think about compatibility up front - I think > > > this > > > > is > > > > > the main argument for this approach. By including those APIs, it's > a > > > big > > > > > red flag to connector developers that they need to think about > > > different > > > > > versions of their config and how they will handle any changes. > > > > > > > > > > I'm ok with including the versioning or not. I don't feel too > > strongly > > > > > because I think including the versions is really a band-aid, not a > > real > > > > > solution, to the compatibility problem. (I don't think there is a > > > > solution > > > > > for it; it's just a really hard problem...). I am wary of adding > more > > > > > methods to the APIs because keeping the connector APIs as simple as > > > > > possible is really important for getting non-Kafka devs to develop > > > > > connectors. > > > > > > > > > > By the way, this probably extends to the Tasks as well -- they have > > the > > > > > same basic configuration compatibility problems as Connectors do. > So > > I > > > > > think adopting this approach implies that we'd do the same to the > > Task > > > > > interfaces. > > > > > > > > > > > > > > > On Thu, Aug 13, 2015 at 10:16 PM, Gwen Shapira <g...@confluent.io> > > > > wrote: > > > > > > > > > > > Hi Team Kafka, > > > > > > > > > > > > This may be a slightly premature discussion, but forgetting about > > > > > upgrades > > > > > > is a newbie mistake I'd like us to avoid :) > > > > > > > > > > > > So, we have connector-plugins, and users use them to create > > > > > > connector-instances. This requires some configuration - database > > > > > connection > > > > > > string, HDFS namenode, etc. > > > > > > > > > > > > Now suppose we have a SystemX connector-plugin, and it takes just > > > > > > "connection string" as argument. And people happily use it. Now > > > imagine > > > > > > that the guy who wrong the plugin wants to release > > > SystemX-plugin-2.0. > > > > > > > > > > > > Ideally, we'd want users to be able to drop 2.0 jar instead of > 2.0 > > > and > > > > > > restart their connector-instances and keep on running with the > > > existing > > > > > > configuration. > > > > > > > > > > > > But what if SystemX-plugin-2.0 made changes to the configuration? > > > What > > > > if > > > > > > it now has a new mandatory parameter? Or if the connection string > > > > format > > > > > > changed? > > > > > > > > > > > > I'd like to give connector developers a way to upgrade existing > > > > > > configuration when they release a new version. > > > > > > > > > > > > My proposal: > > > > > > 1. Connector API now includes 2 new methods - int getVersion() > and > > > > > > configuration upgrade(configuration) > > > > > > 2. When the framework persists configuration for the connector > (I'm > > > > > talking > > > > > > mostly about cluster mode where we want to keep the configuration > > in > > > > > > Kafka), it also persists the version #. > > > > > > 3. When starting a connector-instance, if the version from > > > getVersion() > > > > > > doesn't match the version in the persisted configs, the framework > > > will > > > > > call > > > > > > upgrade() with existing configs so the connector can upgrade them > > and > > > > > > return the new configs which will then be persisted with the new > > > > version. > > > > > > 4. If the upgrade fails, the connector-instance will not run. > > > > > > > > > > > > Does that make sense? > > > > > > > > > > > > Gwen > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Thanks, > > > > > Ewen > > > > > > > > > > > > > > >