Hi,

I've recently been experimenting with setting the values of the `offset`,
`config` and `status` storage topics within Kafka Connect.

I'm aware from various sources (Robin Moffatt blogs, StackOverflow,
Confluent Kafka Connect docs) that these topics should not be shared across
different Connect **clusters**, i.e. each unique set of workers with a
given `group.id` should use its own set of internal storage topics.

These discussions and docs usually talk about sharing all three topics at
once. However, I am interested in reusing only the offset storage topic,
and I am struggling to find the risks of sharing this offset topic between
different Connect clusters.

I'm aware of issues with sharing the config and status topics from blogs
and my own testing (clusters can end up running connectors from other
clusters, for example), but I cannot find a case for not sharing the offset
topic despite guidance to avoid this.

The use cases I am interested in are:

1. Sharing an offset topic between clusters, but never in parallel.


*e.g. cluster 1 running connector A uses the offset topic; cluster 1 and
connector A are deleted; then cluster 2 running connector B is created and
uses the same offset topic.*

2. As above, but using the offset topic in parallel.

As the offset.storage topic is keyed by connector name (for the source
connectors I've tried), I do not see the risk in either of the above cases
**unless** more than one connector exists with the same name in separate
clusters. In that case there would be a risk of key collision, as
`group.id` is not part of the offset topic keys.
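
To make the collision concern concrete, here is a minimal Python sketch of
what I think happens. It does not talk to Kafka at all: `CompactedTopic` and
`offset_key` are hypothetical helpers, the connector names and file path are
made up, and the key layout (a JSON array of connector name and source
partition, with no `group.id`) is my assumption based on the keys I have
seen in the topic.

```python
import json


def offset_key(connector_name, source_partition):
    # Assumed key layout: [connector name, source partition] as JSON.
    # Note that the worker's group.id does not appear anywhere in the key.
    return json.dumps([connector_name, source_partition],
                      separators=(",", ":"), sort_keys=True)


class CompactedTopic:
    """Toy stand-in for a log-compacted topic: the last write per key wins."""

    def __init__(self):
        self.latest = {}

    def write(self, key, value):
        self.latest[key] = value

    def read(self, key):
        return self.latest.get(key)


shared = CompactedTopic()
partition = {"filename": "/data/in.txt"}  # hypothetical source partition

# Two clusters (different group.id) each run a connector named "file-source"
# against the same shared offset topic.
shared.write(offset_key("file-source", partition), {"position": 100})  # cluster 1
shared.write(offset_key("file-source", partition), {"position": 7})    # cluster 2

# Cluster 1 now reads back cluster 2's offset: the keys are identical.
collided = shared.read(offset_key("file-source", partition))

# With distinct connector names the entries stay separate.
shared.write(offset_key("file-source-a", partition), {"position": 100})
shared.write(offset_key("file-source-b", partition), {"position": 7})
kept_apart = shared.read(offset_key("file-source-a", partition))
```

If this model is right, the sequential case (1) only goes wrong when
connector B reuses connector A's name and would therefore pick up A's stale
offsets, and the parallel case (2) only goes wrong when two live connectors
share a name.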

Any insights into why sharing the offset topic between clusters should be
avoided in the cases described would be greatly appreciated, thank you.
