Hi Greg,

Many thanks for the KIP. Here are a few initial questions:

1. Incomplete sentence: "But Connect is not situated to be able to manage
resources directly, as workers are given a fixed "
2. You explain how sessioned is now a subset of static, but what happens in
a cluster where some workers are using static and some are using either
eager or compatible?
3. "Assign each unassigned static job to a static worker which specifies
that job, choosing arbitrarily if there are multiple valid workers." I
think there are less ambiguous words than "arbitrarily" to specify this
behaviour. Hashing the task name would _appear_ pretty arbitrary to the
user, but would be deterministic. Picking at random would be
non-deterministic. It would be even better if the KIP gave a rationale for
whichever behaviour you choose.
4. You don't describe how a user, or automated system, starting with a set
of connectors, should find out the tasks that they want to run. This
relates to the contract of
org.apache.kafka.connect.connector.Connector#taskConfigs(int maxTasks).
AFAIK (please correct me if I'm wrong, because it's a long time since I
looked at this code) nothing validates that the returned list has at most
`maxTasks` entries, and connectors can, of course, return fewer than that
many tasks. So without deeper knowledge of a particular connector it's
not clear to the user/operator how to configure their static workers and
static assignments.
5. Is there a lurking assumption that task indices are stable? E.g. that
the task with index 3 will always be the resource-intensive one. I can see
that that would often be a reliable assumption, but it's not clear to me
that it is always the case.
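
To make point 3 concrete, here is a rough sketch of the kind of
"looks arbitrary, but is deterministic" choice I have in mind. The class
and method names are made up for illustration and are not from the KIP or
the Connect codebase, and CRC32 is just one convenient stable hash, not a
recommendation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper for illustration only; not part of Kafka Connect.
final class DeterministicWorkerChoice {

    // Picks one worker for a task by hashing the task id.
    // The same task id and the same set of valid workers always give
    // the same answer, regardless of the order the workers are listed in.
    static String pickWorker(String taskId, List<String> validWorkers) {
        if (validWorkers.isEmpty()) {
            throw new IllegalArgumentException("no valid workers for " + taskId);
        }
        // Sort a copy so the result does not depend on input order.
        List<String> sorted = new ArrayList<>(validWorkers);
        Collections.sort(sorted);

        // CRC32 yields a non-negative long, so the modulo is a valid index.
        CRC32 crc = new CRC32();
        crc.update(taskId.getBytes(StandardCharsets.UTF_8));
        int index = (int) (crc.getValue() % sorted.size());
        return sorted.get(index);
    }

    public static void main(String[] args) {
        List<String> workers = List.of("worker-b", "worker-a", "worker-c");
        // Prints the same worker on every run and on every JVM.
        System.out.println(pickWorker("my-connector-3", workers));
    }
}
```

A random pick, by contrast, could move the task to a different worker on
each rebalance, which is why I think the KIP should say which of the two
it means.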

Thanks again,

Tom

On Fri, 6 Oct 2023 at 12:36, Greg Harris <greg.har...@aiven.io.invalid>
wrote:

> Hey everyone!
>
> I'd like to propose an improvement to Kafka Connect's scheduling
> algorithm, with the goal of improving the operability of connect
> clusters through resource isolation.
>
> Find the KIP here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-987%3A+Connect+Static+Assignments
>
> This feature is primarily intended to be consumed by cluster
> management systems, so I've opened a sister proposal in the Strimzi
> project that uses this feature to provide user-facing resource
> isolation. The base feature is generic, so it is usable manually and
> with cluster management systems other than Strimzi.
>
> Find the proposal here: https://github.com/strimzi/proposals/pull/96
>
> Thanks!
> Greg Harris
>
>
