Re: Kafka Mirror Maker place of execution

Anatoliy Soldatov Wed, 13 Mar 2019 01:51:33 -0700

Hi, Franz!

I guess one of the reasons could be additional safety in case of a network split.


There is always some probability of bugs, even in good software. If we place MM 
on the source cluster and the network splits, the consumers could (theoretically) 
continue to read messages from the source cluster and commit their offsets even 
without acks from the destination cluster (one possible bug). You would then end 
up with messages lost on the producer side once the network is restored.

On the other hand, if we place MM on the destination cluster and the network 
splits, nothing bad happens. MM will be unable to fetch data from the source 
cluster, so your data won’t be corrupted even if there are bugs.
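
To make the failure mode concrete, here is a minimal, purely illustrative Python sketch. It does not use any Kafka or Mirror Maker code; `Destination`, `mirror`, and the `commit_without_ack` flag are hypothetical names that only model the ordering of "send, wait for ack, then commit the source offset" during a network split.

```python
class Destination:
    """Stand-in for the target cluster; `reachable` models the network link."""

    def __init__(self):
        self.log = []
        self.reachable = True

    def send(self, message):
        # Returns True only when the write is acknowledged.
        if not self.reachable:
            return False
        self.log.append(message)
        return True


def mirror(source, committed, destination, commit_without_ack=False):
    """Copy messages starting at offset `committed`; return the new offset."""
    for offset in range(committed, len(source)):
        acked = destination.send(source[offset])
        if not acked and not commit_without_ack:
            break  # keep the old offset so the message is retried later
        committed = offset + 1  # with the bug, this runs even without an ack
    return committed


SOURCE = ["m0", "m1", "m2", "m3"]


def run(commit_without_ack):
    dest = Destination()
    dest.reachable = False  # network split
    offset = mirror(SOURCE, 0, dest, commit_without_ack)
    dest.reachable = True   # network restored
    mirror(SOURCE, offset, dest, commit_without_ack)
    return dest.log


safe_log = run(commit_without_ack=False)   # offsets wait for acks
buggy_log = run(commit_without_ack=True)   # offsets advance during the split
```

In this toy model the correct ordering re-reads everything after the split, while the hypothetical bug advances the offset past messages that never reached the destination, so they are skipped for good.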

Tolya

> On 12 March 2019, at 16:13, Franz van Betteraey <fvbetter...@web.de> 
> wrote:
>
> Hi all,
>
> there are best practices out there which recommend running the Mirror Maker on 
> the target cluster.
> https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
>
> I wonder why this recommendation exists, because ultimately all data must 
> cross the border between the clusters, regardless of whether it is 
> consumed at the target or produced at the source. A reason I can imagine is 
> that the Mirror Maker supports multiple consumers but only one producer, so 
> consuming data over the link with the greater latency might be sped up by the 
> use of multiple consumers.
>
> If performance through multithreading is the point, would it be useful to 
> use several producers (one per consumer) to replicate the data (with a custom 
> replication process)? Does anyone know why the Mirror Maker shares a single 
> producer among all consumers?
>
> My use case is the replication of data from several source clusters (~10) to a 
> single target cluster. I would prefer to run the replication process on the 
> source clusters to avoid too many replication processes (one per source) 
> on the target cluster.
>
> Hints and suggestions on this topic are very welcome.
>
> Best regards
>  Franz
>
> If you would like to earn some SO recommendation points feel free to answer 
> this question on SO ;-)
> https://stackoverflow.com/q/55122268/367285

