Re: [DISCUSS] KIP-454: Expansion of the ConnectClusterState interface

Konstantine Karantasis Fri, 19 Apr 2019 11:25:26 -0700

Resending an email I sent this Monday but didn't make it to the mailing
list:


----

Hi Chris,

thanks for the KIP!
I have the following initial comments:

1. In its current form, the code snippet gives the impression that this is
a new interface or an interface that completely replaces the existing one.
It's not clear that the interface is extended. You think we could represent
this extension in a better way? (I'm aware that in the text you say that
these methods are additional, but the code block gives a partial view of
this interface).

2. In the compatibility page it'd be nice to read which version this
feature is targeting. Now, given that KIP-285 was introduced in 2.0.0, I
wonder if it'd make sense to have default methods for the new interface
methods that you suggest adding.

3. Any reason why ConnectorTaskId is not used instead of Integer as key
type in taskConfigs? This class is already part of the connect-api package
and I'd imagine reusing it might allow for fewer transformations between
task config maps currently used and the new ones that will be returned by
this interface method.

4. Finally, there's a mention on the interface javadoc about how these
configs are retrieved using the Connect herder, but it's not clear whether
the leader of the workers' group will be queried or not for this
information. I think a paragraph describing what are the assumptions with
respect to what consists a "current" task configuration and how this is
retrieved would be valuable here.

Best,
Konstantine



On Thu, Apr 18, 2019 at 2:45 PM Chris Egerton <[email protected]> wrote:

> Hi Magesh,
>
> Thanks for your comments. I'll address them in the order you provided them:
>
> 1 - Reason for exposing task configurations to REST extensions:
> Yes, the motivation is a little thin for exposing task configs to REST
> extensions. I can think of a few uses for this functionality, such as
> attempting to infer problematic configurations by examining failed tasks
> and comparing their configurations to the configurations of running tasks,
> but like you've indicated it's dubious that the best place for anything
> like that belongs in a REST extension.
> I'd be interested to hear others' thoughts, but right now I'm not too
> opposed to erring on the side of caution and leaving it out. Worst case, it
> takes another KIP to add this later on down the road, but that's a small
> price to pay to avoid adding support for a feature that nobody needs.
>
> 2. Usefulness of exposing Kafka cluster ID to REST extensions:
> As the KIP states, "the Kafka cluster ID may be useful for the purpose of
> uniquely identifying a Connect cluster from within a REST extension, since
> users may be running multiple Kafka clusters and the group.id for a
> distributed Connect cluster may not be sufficient to identify a cluster."
> Even though there may be producer or consumer overrides for
> bootstrap.servers present in the configuration for the worker, these will
> not affect which Kafka cluster is used as a backing store for connector
> configurations, offsets, and statuses, so the Kafka cluster ID for the
> worker in conjunction with the Connect group ID should be sufficient to
> uniquely identify a Connect cluster.
> We can and should document that the Connect cluster with overridden
> producer.bootstrap.servers or consumer.bootstrap.servers may be writing
> to/reading from a different Kafka cluster. However, REST extensions are
> already passed the configs for their worker through their configure(...)
> method, so they'd be able to detect any such overrides and act accordingly.
>
> Thanks again for your thoughts!
>
> Cheers,
>
> Chris
>
> On Thu, Apr 18, 2019 at 11:08 AM Magesh Nandakumar <[email protected]>
> wrote:
>
> > Hi Chris,
> >
> > Thanks for the KIP. Overall, it looks good and straightforward to me.
> >
> > I had a few questions on the new methods
> >
> > 1. I'm not sure if an extension would ever require the task configs. An
> > extension generally should only require the health and current state of
> the
> > connector which includes the connector config. I was wondering if there
> is
> > a specific reason it would need task configs.
> > 2. Also, I'm not convinced that kafkaClusterId() belongs to the
> > ConnectClusterState
> > interface. The interface is generally to provide information about the
> > Connect cluster and its information.  Also, the kafkaClusterId could
> > potentially change based on whether there is a "producer." or "consumer."
> > prefix, right?
> >
> > Having said that, I would prefer to have connectorConfigs which I think
> is
> > a great idea and addition to the interface. Let me know what you think.
> >
> > Thanks,
> > Magesh
> >
> > On Sat, Apr 13, 2019 at 9:00 PM Chris Egerton <[email protected]>
> wrote:
> >
> > > Hi all,
> > >
> > > I've posted "KIP-454: Expansion of the ConnectClusterState interface",
> > > which proposes that we add provide more information about the Connect
> > > cluster to REST extensions.
> > >
> > > The KIP can be found at
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-454%3A+Expansion+of+the+ConnectClusterState+interface
> > >
> > > I'm eager to hear people's thoughts on this and will appreciate any
> > > feedback.
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> >
>

Re: [DISCUSS] KIP-454: Expansion of the ConnectClusterState interface

Reply via email to