Re: Streams - multiple clusters support

Ryanne Dolan Thu, 13 Feb 2020 09:16:28 -0800

Cyrille, I don't see why using MM1/2 would break your isolation
requirement. But if you can't mirror topics for some reason, consider Flink
instead of Kafka Streams.


Ryanne

On Thu, Feb 13, 2020 at 10:52 AM Cyrille Karmann <[email protected]>
wrote:

> Hello,
>
> We are trying to create a streaming pipeline of data between different
> Kafka clusters. Our users send data to the input Kafka cluster, and we want
> to process this data and send the result to topics on another Kafka
> cluster.
>
> We have several reasons for this setup, but the main one is isolation:
> the two clusters don't need to share the same configuration, and the first
> ("input") Kafka cluster is critical, so we want to be able to do
> maintenance on the second cluster without impacting the first one. Also,
> we have more than a thousand topics on each side, so managing them
> separately is easier.
>
> We are investigating different technologies for the processing part, and
> Kafka Streams looked promising, except that it apparently does not support
> writing to a different cluster than the one it reads from.
>
> I saw people on forums suggesting writing to the first cluster and using
> MirrorMaker to channel the data to the output cluster. This breaks our
> isolation requirements and adds latency, so we don't want to do that.
>
> I have two questions:
>
> - Is there a reason behind the constraint that Kafka Streams cannot
> produce to a different cluster? I see that Kafka Streams allows specifying
> a different configuration for the producer, but it explicitly disallows it
> for ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, so it is definitely something
> the developers did not want to support (
> https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/StreamsConfig.html#getMainConsumerConfigs-java.lang.String-java.lang.String-
> ), but I am not clear why.
>
> - At the same time, there is the KafkaClientSupplier mechanism, which
> allows us to inject our own KafkaProducer. I was actually successful in
> injecting a KafkaProducer that connects to a different cluster. The fact
> that I am able to do, using a sparsely documented API, something that other
> parts of the Kafka Streams library try to prevent me from doing makes me
> wonder whether I am breaking something. In particular, exactly-once
> processing is important to me, so I want to be sure it would still work.
>
> Thanks,
> Cyrille
>
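
For reference, the bootstrap-server restriction raised in the first question
can be modeled roughly as below. This is a minimal, stdlib-only sketch and not
Kafka's actual code: the class ProducerConfigSketch, the method
producerConfigs, and the host names are hypothetical, and the precedence rule
it encodes (the application-level bootstrap.servers replaces any
"producer."-prefixed override) is an assumption based on the behavior
described in the quoted message.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of how a Streams-style config could derive
// the producer config. It is NOT the real StreamsConfig implementation.
class ProducerConfigSketch {
    static final String BOOTSTRAP = "bootstrap.servers";
    static final String PRODUCER_PREFIX = "producer.";

    static Map<String, Object> producerConfigs(Map<String, Object> streamsProps) {
        Map<String, Object> out = new HashMap<>();
        // Copy every "producer."-prefixed override, stripping the prefix.
        for (Map.Entry<String, Object> e : streamsProps.entrySet()) {
            if (e.getKey().startsWith(PRODUCER_PREFIX)) {
                out.put(e.getKey().substring(PRODUCER_PREFIX.length()), e.getValue());
            }
        }
        // Assumed rule: the application-level bootstrap servers always win,
        // so a producer-specific cluster address is silently discarded.
        out.put(BOOTSTRAP, streamsProps.get(BOOTSTRAP));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", "input-kafka:9092");           // hypothetical host
        props.put("producer.bootstrap.servers", "output-kafka:9092"); // hypothetical host
        props.put("producer.linger.ms", 50);

        Map<String, Object> producer = producerConfigs(props);
        System.out.println(producer.get("bootstrap.servers")); // input-kafka:9092
        System.out.println(producer.get("linger.ms"));         // 50
    }
}
```

In this model, ordinary producer settings such as linger.ms pass through,
while the attempt to point the producer at the output cluster has no effect.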
