Yes, I think that makes sense. As I see it, the tradeoffs are:

1. Complexity - adding these APIs increases the number of things connector
developers need to implement just to get started. In a lot of cases, the
first version of a connector might literally only have some connection
string as a setting and that setting will never be removed/changed. So it
would be, in some sense, unnecessary added complexity.
2. Relies on correct modification of version number by connector developer
- of course any compatibility relies on the connector developer noting
incompatibilities and handling them properly, but baking versioning into
the APIs suggests we're guaranteeing compatibility. However, we know from
experience that this is a lot harder than just adding version numbers.
Seemingly small changes may change the semantics but not the format of the
config such that there is an unexpected incompatibility.
3. Forces developers to think about compatibility up front - I think this is
the main argument for this approach. Including those APIs is a big red flag
to connector developers that they need to think about different versions of
their config and how they will handle any changes.
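
To make point 2 concrete, here is a small illustration of a change that
keeps the config format but alters its meaning. Everything in it is made up
for the example: the setting name "timeout", the class, and both reader
methods are hypothetical and not part of any real connector.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical example: the "timeout" key and both readers are invented.
final class TimeoutSemantics {
    // Version 1 of the imaginary connector reads "timeout" as seconds.
    static long timeoutMillisV1(Map<String, String> config) {
        return Long.parseLong(config.get("timeout")) * 1000L;
    }

    // Version 2 reads the very same key as milliseconds.
    static long timeoutMillisV2(Map<String, String> config) {
        return Long.parseLong(config.get("timeout"));
    }

    public static void main(String[] args) {
        // The persisted config is valid for both versions; only the meaning differs.
        Map<String, String> stored = Collections.singletonMap("timeout", "30");
        System.out.println("v1 waits " + timeoutMillisV1(stored) + " ms");
        System.out.println("v2 waits " + timeoutMillisV2(stored) + " ms");
    }
}
```

The stored config parses cleanly under both versions, so a version number
only helps here if the developer remembers to bump it and convert the value.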

I'm ok with including the versioning or not. I don't feel too strongly
because I think including the versions is really a band-aid, not a real
solution, to the compatibility problem. (I don't think there is a solution
for it; it's just a really hard problem...). I am wary of adding more
methods to the APIs because keeping the connector APIs as simple as
possible is really important for getting non-Kafka devs to develop
connectors.

By the way, this probably extends to the Tasks as well -- they have the
same basic configuration compatibility problems as Connectors do. So I
think adopting this approach implies that we'd do the same to the Task
interfaces.
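
For reference, a rough sketch of how the proposed flow might look, under some
assumptions. Only getVersion() and upgrade() come from the proposal; the
interface name, the StoredConfig holder, the extra storedVersion argument to
upgrade(), the SystemX connector, and the "batch.size" setting are all
hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface carrying the two proposed methods.
interface VersionedConnector {
    int getVersion();

    // Assumption: the stored version is passed in so the connector knows
    // what it is upgrading from. Throws if the config cannot be upgraded.
    Map<String, String> upgrade(int storedVersion, Map<String, String> storedConfig);
}

// Hypothetical holder for what the framework persists: config plus version.
final class StoredConfig {
    final int version;
    final Map<String, String> settings;

    StoredConfig(int version, Map<String, String> settings) {
        this.version = version;
        this.settings = settings;
    }
}

// Imaginary 2.0 plugin that adds a mandatory "batch.size" setting.
final class SystemXConnectorV2 implements VersionedConnector {
    @Override
    public int getVersion() {
        return 2;
    }

    @Override
    public Map<String, String> upgrade(int storedVersion, Map<String, String> storedConfig) {
        if (storedVersion != 1) {
            throw new IllegalStateException("Cannot upgrade from version " + storedVersion);
        }
        Map<String, String> upgraded = new HashMap<>(storedConfig);
        upgraded.putIfAbsent("batch.size", "100"); // default for the new setting
        return upgraded;
    }
}

final class UpgradeOnStart {
    // What the framework might do before starting a connector instance.
    static StoredConfig resolve(VersionedConnector connector, StoredConfig stored) {
        if (stored.version == connector.getVersion()) {
            return stored; // versions match, nothing to do
        }
        // An exception here propagates, so the instance does not start.
        Map<String, String> upgraded = connector.upgrade(stored.version, stored.settings);
        return new StoredConfig(connector.getVersion(), upgraded);
    }

    public static void main(String[] args) {
        Map<String, String> old = new HashMap<>();
        old.put("connection.string", "systemx://host");
        StoredConfig result = resolve(new SystemXConnectorV2(), new StoredConfig(1, old));
        System.out.println(result.version + " " + result.settings);
    }
}
```

The same pattern would presumably apply to the Task interfaces. The framework
would persist the returned StoredConfig so the upgrade runs only once.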


On Thu, Aug 13, 2015 at 10:16 PM, Gwen Shapira <g...@confluent.io> wrote:

> Hi Team Kafka,
>
> This may be a slightly premature discussion, but forgetting about upgrades
> is a newbie mistake I'd like us to avoid :)
>
> So, we have connector-plugins, and users use them to create
> connector-instances. This requires some configuration - database connection
> string, HDFS namenode, etc.
>
> Now suppose we have a SystemX connector-plugin, and it takes just
> "connection string" as argument. And people happily use it. Now imagine
> that the guy who wrote the plugin wants to release SystemX-plugin-2.0.
>
> Ideally, we'd want users to be able to drop 2.0 jar instead of 1.0 and
> restart their connector-instances and keep on running with the existing
> configuration.
>
> But what if SystemX-plugin-2.0 made changes to the configuration? What if
> it now has a new mandatory parameter? Or if the connection string format
> changed?
>
> I'd like to give connector developers a way to upgrade existing
> configuration when they release a new version.
>
> My proposal:
> 1. Connector API now includes 2 new methods - int getVersion() and
> configuration upgrade(configuration)
> 2. When the framework persists configuration for the connector (I'm talking
> mostly about cluster mode where we want to keep the configuration in
> Kafka), it also persists the version #.
> 3. When starting a connector-instance, if the version from getVersion()
> doesn't match the version in the persisted configs, the framework will call
> upgrade() with existing configs so the connector can upgrade them and
> return the new configs which will then be persisted with the new version.
> 4. If the upgrade fails, the connector-instance will not run.
>
> Does that make sense?
>
> Gwen
>



-- 
Thanks,
Ewen
