Totally agree with the priorities, and I'm actually not even advocating the
alternative I through out. I think, ultimately, the connector developer is
the one who has to support compatibility, though, irrespective of the
mechanism. So I guess it is really just a question of whether it is easier
for people to work with explicit versioning or the kind of loosy goosy
mechanism I described. I don't know the answer to that question. Internally
Kafka and related things have always used explicit versioning and I kind of
prefer that too, but I have observed that people struggle with it.

-Jay

On Mon, Aug 17, 2015 at 9:21 AM, Gwen Shapira <g...@confluent.io> wrote:

> I see our "make life easy" priorities as follows: users > connector
> developers > core copycat developers.
> In this case I think that best user experience will be consistent behavior
> for all copycat connectors, so they'll know what to expect during upgrades.
>
> I'm not sure how much we can enforce from the framework side, but I suspect
> we can do better than "advise connector developers to ignore unknown
> configs"... even if that means making the framework or even connector
> development slightly more difficult.
>
>
> On Mon, Aug 17, 2015 at 9:08 AM, Jay Kreps <j...@confluent.io> wrote:
>
> > Yeah I meant this purely as an approach for the connectors themselves to
> > implement not that we would automatically suppress configs in the
> > framework.
> >
> > -Jay
> >
> > On Fri, Aug 14, 2015 at 7:48 PM, Gwen Shapira <g...@confluent.io> wrote:
> >
> > > The "ignore new configs" plan is good, IMO. I just don't know if its
> > > feasible in current Copycat:
> > > I'm not sure CC can ignore configuration on the connector behalf. Also,
> > > this is something we will want to log very clearly and I'm not sure
> this
> > is
> > > doable outside the connector.
> > >
> > > Regarding complex implementation for connector developers. All
> version1.0
> > > connectors will have the following:
> > >
> > > int getVersion() {
> > >     return 1;
> > > }
> > >
> > > Configuration upgradeConfig(Configuration config) {
> > >     return config;
> > > }
> > >
> > > Not that bad :) We can even have a base class to avoid boilerplate.
> > >
> > > The complexity shows up when upgradeConfig has to actually upgrade
> > configs,
> > > but thats our whole point, IMO.
> > >
> > > Gwen
> > >
> > > On Fri, Aug 14, 2015 at 10:48 AM, Jay Kreps <j...@confluent.io> wrote:
> > >
> > > > I think this is a great area to think about.
> > > >
> > > > What about the simpler informal scheme of just having cc ignore
> config
> > it
> > > > doesn't know about and instructing connector developers to maintain
> > > > compatibility when renaming params (i.e. support both old and new) as
> > we
> > > do
> > > > for Kafka config.
> > > >
> > > > The steps would then be:
> > > > 1. CC connector developer adds new config support, possibly
> deprecating
> > > > older configs but not removing them.
> > > > 2. CC user then upgrades their config, then the connector
> > implementation
> > > > 3. CC user can always roll back if this doesn't work in which case
> the
> > > new
> > > > configs added are ignored
> > > >
> > > > If you want to totally refactor your configs in a way where
> > compatibility
> > > > is just not doable then you should rename the connector.
> > > >
> > > > This is one where a system test that tests last release versus this
> > > release
> > > > would probably do a lot to help ensure this stuff actually worked as
> > > > expected.
> > > >
> > > > Not sure if this is better or worse than the hard versioning scheme.
> > > >
> > > > -Jay
> > > >
> > > > On Fri, Aug 14, 2015 at 12:52 AM, Ewen Cheslack-Postava <
> > > e...@confluent.io
> > > > >
> > > > wrote:
> > > >
> > > > > Yes, I think that makes sense. As I see it, the tradeoffs are:
> > > > >
> > > > > 1. Complexity - adding these APIs increases the number of things
> > > > connector
> > > > > developers need to implement just to get started. In a lot of
> cases,
> > > the
> > > > > first version of a connector might literally only have some
> > connection
> > > > > string as a setting and that setting will never be removed/changed.
> > So
> > > it
> > > > > would be, in some sense, unnecessary added complexity.
> > > > > 2. Relies on correct modification of version number by connector
> > > > developer
> > > > > - of course any compatibility relies on the connector developer
> > noting
> > > > > incompatibilities and handling them properly, but baking versioning
> > > into
> > > > > the APIs suggests we're guaranteeing compatibility. However, we
> know
> > > from
> > > > > experience that this is a lot harder than just adding version
> > numbers.
> > > > > Seemingly small changes may change the semantics but not the format
> > of
> > > > the
> > > > > config such that there is an unexpected incompatibility.
> > > > > 3. Force developers to think about compatibility up front - I think
> > > this
> > > > is
> > > > > the main argument for this approach. By including those APIs, it's
> a
> > > big
> > > > > red flag to connector developers that they need to think about
> > > different
> > > > > versions of their config and how they will handle any changes.
> > > > >
> > > > > I'm ok with including the versioning or not. I don't feel too
> > strongly
> > > > > because I think including the versions is really a band-aid, not a
> > real
> > > > > solution, to the compatibility problem. (I don't think there is a
> > > > solution
> > > > > for it; it's just a really hard problem...). I am wary of adding
> more
> > > > > methods to the APIs because keeping the connector APIs as simple as
> > > > > possible is really important for getting non-Kafka devs to develop
> > > > > connectors.
> > > > >
> > > > > By the way, this probably extends to the Tasks as well -- they have
> > the
> > > > > same basic configuration compatibility problems as Connectors do.
> So
> > I
> > > > > think adopting this approach implies that we'd do the same to the
> > Task
> > > > > interfaces.
> > > > >
> > > > >
> > > > > On Thu, Aug 13, 2015 at 10:16 PM, Gwen Shapira <g...@confluent.io>
> > > > wrote:
> > > > >
> > > > > > Hi Team Kafka,
> > > > > >
> > > > > > This may be a slightly premature discussion, but forgetting about
> > > > > upgrades
> > > > > > is a newbie mistake I'd like us to avoid :)
> > > > > >
> > > > > > So, we have connector-plugins, and users use them to create
> > > > > > connector-instances. This requires some configuration - database
> > > > > connection
> > > > > > string, HDFS namenode, etc.
> > > > > >
> > > > > > Now suppose we have a SystemX connector-plugin, and it takes just
> > > > > > "connection string" as argument. And people happily use it. Now
> > > imagine
> > > > > > that the guy who wrong the plugin wants to release
> > > SystemX-plugin-2.0.
> > > > > >
> > > > > > Ideally, we'd want users to be able to drop 2.0 jar instead of
> 2.0
> > > and
> > > > > > restart their connector-instances and keep on running with the
> > > existing
> > > > > > configuration.
> > > > > >
> > > > > > But what if SystemX-plugin-2.0 made changes to the configuration?
> > > What
> > > > if
> > > > > > it now has a new mandatory parameter? Or if the connection string
> > > > format
> > > > > > changed?
> > > > > >
> > > > > > I'd like to give connector developers a way to upgrade existing
> > > > > > configuration when they release a new version.
> > > > > >
> > > > > > My proposal:
> > > > > > 1. Connector API now includes 2 new methods - int getVersion()
> and
> > > > > > configuration upgrade(configuration)
> > > > > > 2. When the framework persists configuration for the connector
> (I'm
> > > > > talking
> > > > > > mostly about cluster mode where we want to keep the configuration
> > in
> > > > > > Kafka), it also persists the version #.
> > > > > > 3. When starting a connector-instance, if the version from
> > > getVersion()
> > > > > > doesn't match the version in the persisted configs, the framework
> > > will
> > > > > call
> > > > > > upgrade() with existing configs so the connector can upgrade them
> > and
> > > > > > return the new configs which will then be persisted with the new
> > > > version.
> > > > > > 4. If the upgrade fails, the connector-instance will not run.
> > > > > >
> > > > > > Does that make sense?
> > > > > >
> > > > > > Gwen
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Thanks,
> > > > > Ewen
> > > > >
> > > >
> > >
> >
>

Reply via email to