Hello,

We are trying to create a streaming pipeline of data between different
Kafka clusters. Our users send data to the input Kafka cluster, and we want
to process this data and send the result to topics on another Kafka cluster.

We have several reasons for this setup, but the main one is isolation:
the two clusters do not need to share the same configuration, and the first
("input") Kafka cluster is critical, so we want to be able to do maintenance
on the second cluster without impacting the first one. Also, we have more
than a thousand topics on each side, so managing them separately is easier.

We are investigating different technologies for the processing part, and
Kafka Streams looks promising, except that it apparently does not support
writing to a cluster other than the one it reads from.

I have seen people on forums suggest writing to the first cluster and using
MirrorMaker to copy the data to the output cluster. This breaks our
isolation requirement and adds latency, so we don't want to do that.

I have two questions:

- Is there a reason behind the constraint that Kafka Streams cannot
produce to a different cluster? I see that Kafka Streams allows specifying
a different configuration for the producer, but it explicitly disallows
overriding ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, so it is definitely
something the developers did not want to support (
https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/StreamsConfig.html#getMainConsumerConfigs-java.lang.String-java.lang.String-)
but it is not clear to me why.

- At the same time, there is the KafkaClientSupplier mechanism, which allows
us to inject our own KafkaProducer. I actually succeeded in injecting a
KafkaProducer that connects to a different cluster. The fact that I can
do, through a sparsely documented API, something that other parts of the
Kafka Streams library try to prevent makes me wonder whether I am breaking
something by doing this. In particular, exactly-once processing is
important to me, so I want to be sure it would still work.
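
To make the second question concrete, here is a minimal sketch of the kind
of override I mean. It is illustrative rather than our exact code: the
class and method names are hypothetical and the host names are
placeholders. It only shows the configuration step, in plain Java, so that
it runs without the Kafka libraries on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the config a custom producer would use so
// that it targets the output cluster instead of the input cluster.
public class OutputClusterProducerConfig {

    // Same string as ProducerConfig.BOOTSTRAP_SERVERS_CONFIG in kafka-clients.
    static final String BOOTSTRAP_SERVERS = "bootstrap.servers";

    // Copies the producer config handed over by Kafka Streams and replaces
    // only the bootstrap servers; every other setting is left untouched.
    static Map<String, Object> forOutputCluster(
            Map<String, Object> streamsProducerConfig,
            String outputBootstrapServers) {
        Map<String, Object> copy = new HashMap<>(streamsProducerConfig);
        copy.put(BOOTSTRAP_SERVERS, outputBootstrapServers);
        return copy;
    }

    public static void main(String[] args) {
        // Placeholder config standing in for what Kafka Streams would pass.
        Map<String, Object> fromStreams = new HashMap<>();
        fromStreams.put(BOOTSTRAP_SERVERS, "input-cluster:9092");
        fromStreams.put("acks", "all");

        Map<String, Object> forOutput =
                forOutputCluster(fromStreams, "output-cluster:9092");
        System.out.println(forOutput.get(BOOTSTRAP_SERVERS));
    }
}
```

In the injected KafkaClientSupplier, the getProducer(config) method would
pass the resulting map to the KafkaProducer constructor with byte-array
serializers, while the consumer methods keep the config they receive. What
I cannot tell is whether the rest of Kafka Streams, and exactly-once in
particular, assumes that producer and consumers share one cluster.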

Thanks,
Cyrille
