There are two reasons (one would be easier to address than the other):

1) A client can only connect to one cluster. To allow data
   repartitioning, the producer must write the repartition data into
   the input cluster; otherwise the consumer cannot read the data back.

   -> We would need to introduce multiple consumers/producers to allow
      for this case.

   - There is also the question of which cluster the
     repartition/changelog topics should be placed in.
   - Some people might even want a third cluster for this, making it
     even more complex.
   - While possible in general, it's a non-trivial architectural
     change.

2) Kafka's exactly-once guarantees only work on a single cluster: the
   input offsets and the output records are committed in one
   transaction, and a transaction cannot span clusters (this is hard
   to change).

What kind of processing do you want to do? Kafka Connect supports
single-message transforms that can carry you a long way already.

-Matthias

On 2/13/20 9:15 AM, Ryanne Dolan wrote:
> Cyrille, I don't see why using MM1/2 would break your isolation
> requirement. But if you can't mirror topics for some reason,
> consider Flink instead of Kafka Streams.
>
> Ryanne
>
> On Thu, Feb 13, 2020 at 10:52 AM Cyrille Karmann
> <cyri...@nnamrak.org> wrote:
>
>> Hello,
>>
>> We are trying to create a streaming pipeline of data between
>> different Kafka clusters. Our users send data to the input Kafka
>> cluster, and we want to process this data and send the result to
>> topics on another Kafka cluster.
>>
>> We have different reasons for this setup, but mainly it's for
>> isolation: the two clusters don't have to have the same
>> configuration, and the first "input" Kafka cluster is critical: we
>> want to be able to do maintenance on the second cluster without
>> impacting the first one. Also, we have more than a thousand topics
>> on each side, so managing them separately is easier.
>>
>> We are investigating different technologies for the processing
>> part, and Kafka Streams looked promising, except that it apparently
>> does not support writing to a different cluster than the one it is
>> reading from.
>>
>> I saw people on forums suggesting to write to the first cluster
>> and use MirrorMaker to channel the data to the output cluster.
>> This breaks our isolation requirements and adds more latency, so we
>> don't want to do that.
>>
>> I have two questions:
>>
>> - Is there a reason behind the constraint that Kafka Streams cannot
>>   produce to a different cluster? I see that Kafka Streams allows
>>   specifying a different configuration for the producer, but it
>>   explicitly disallows it for
>>   ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, so it is definitely
>>   something the developers did not want to support (
>>
>> https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/StreamsConfig.html#getMainConsumerConfigs-java.lang.String-java.lang.String-
>>
>>   ), but I am not clear why that is.
>>
>> - At the same time, there is the KafkaClientSupplier mechanism
>>   that allows us to inject our own KafkaProducer. I was actually
>>   successful in injecting such a KafkaProducer that connects to a
>>   different cluster. The fact that I am able to do, using a
>>   not very well documented API, something that other parts of the
>>   Kafka Streams library try to prevent me from doing makes me
>>   wonder whether I am breaking something by doing this. In
>>   particular, exactly-once processing is important to me, so I want
>>   to be sure it would still work.
>>
>> Thanks,
>> Cyrille
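
To illustrate the single-message transforms mentioned in the reply
above: a minimal sketch of the transform portion of a Kafka Connect
connector configuration, assuming the processing needed is as simple
as routing each record to a renamed topic. RegexRouter is one of the
transforms that ships with Kafka Connect; the alias "route" and the
"processed." prefix are hypothetical names chosen for this example,
and the rest of the connector configuration is omitted.

```properties
# Declare one transform under the (hypothetical) alias "route".
transforms = route
# RegexRouter rewrites the destination topic name of each record.
transforms.route.type = org.apache.kafka.connect.transforms.RegexRouter
# Match the whole source topic name and capture it as group 1.
transforms.route.regex = (.*)
# Hypothetical naming scheme: prefix every topic with "processed.".
transforms.route.replacement = processed.$1
```

Transforms of this kind act on one record at a time, so they would
not cover processing that needs joins, aggregations, or other state.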