Hi Yash,

Thanks for the KIP. It's a nice, focused change. Initially I was hesitant
to support cases where connector validation takes this long, but
considering the alternative is that we give users a 500 error response but
leave the request to create/modify the connector queued up in the herder, I
think I can get behind the motivation here. There's also an argument to be
made about keeping Kafka Connect available even when the systems that it
connects to are in a degraded state.

I have a few alternatives I'd be interested in your thoughts on:

1. Since the primary concern here seems to be that custom connector
validation logic can take too long, do we have any thoughts on adding logic
to check for request timeout after validation has completed and, if it has,
aborting the attempt to create/modify the connector?

2. Right now it's possible that we'll perform two connector config
validations per create/modify request; once on the worker that initially
receives the request, and then again if that worker is not the leader of
the cluster and has to forward the request to the leader. Any thoughts on
optimizing this to only require a single validation per request? We
probably wouldn't want to force all validations to take place on the leader
(could lead to overloading it pretty quickly in large clusters), but we
could add an internal-only query parameter to skip validation and then use
that parameter when forwarding requests from followers to the leader.

3. A worker property is pretty coarse-grained, and difficult to change. We
might allow per-request toggling of the timeout by adding a URL query
parameter like '?timeout=90s' to the REST API to allow tweaking of the
timeout on a more granular basis, and without having to perform a worker
restart.

I'd also like to clarify a point about the rejected alternative "Allow
configuring producer zombie fencing admin request timeout"--is the
implication here that the "rest.api.request.timeout.ms" property will not
control the REST timeout for requests to the 'PUT
/connectors/{connector}/fence' endpoint, or just that it won't control the
timeout that we use for the call to Admin::fenceProducers?

Cheers,

Chris

On Wed, Nov 2, 2022 at 10:07 AM Yash Mayya <yash.ma...@gmail.com> wrote:

> Hi all,
>
> I'd like to start a discussion thread on this small KIP -
>
> https://eu01.z.antigena.com/l/8mqVuSEhdZF05LKGk7J1Cb6esUWht6ps4-73ocd196fgyxC~lGhTLqBEt9BPJy1MBx6RzKVi7qRYYpJwCHRixlmdcayj7YMv~-0fIZJ2f8EjEhp-Qck3~O3Ga1JBTgaY4tTAlwh0VjF6217iBaGVG2PNa-2TO5z8zC5s0Nb28kcG2mUg1tcEfzsAtmWEXWRECoeD4cdUXyPvWeFtYL-ccVv-PmQydgSIYkdeR7~XItAXezft9Mfs5pzjejXSK~MhWW
>
> It proposes the addition of a new Kafka Connect worker configuration to
> allow configuring REST API request timeouts.
>
> Thanks,
> Yash
>

Reply via email to