One or multiple instances of MM to aggregate kafka data to one hadoop

Mingjie Lai Wed, 28 Jan 2015 14:14:57 -0800

Hi.

We have a fairly typical data ingestion use case: we run mirrormaker in
one hadoop data center to mirror kafka data from multiple remote
application data centers. I know that a single mirrormaker instance on one
physical node can consume from multiple kafka sources. With this approach,
we give one mm instance multiple consumer config files so that it can
consume data from multiple places.


Another option is to run multiple mirrormaker instances on one node, with
each mm instance dedicated to pulling data from a single source data
center. Of course, there will still be multiple mm nodes to balance the
load.
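
To make the two layouts concrete, here is a rough sketch in Python that
only builds the command lines for each one. It assumes the 0.8-era
kafka.tools.MirrorMaker launcher with its --consumer.config,
--producer.config and --whitelist options; the properties file names and
the helper function are made up for illustration.

```python
# Sketch only: the properties file names below are hypothetical placeholders.
from typing import List

# Assumed launcher for the 0.8-era mirrormaker tool.
MM_PREFIX = ["bin/kafka-run-class.sh", "kafka.tools.MirrorMaker"]


def mirrormaker_cmd(consumer_configs: List[str], producer_config: str,
                    whitelist: str = ".*") -> List[str]:
    """Build the argument list for one mirrormaker process."""
    cmd = list(MM_PREFIX)
    for path in consumer_configs:
        # One --consumer.config flag per source cluster.
        cmd += ["--consumer.config", path]
    cmd += ["--producer.config", producer_config, "--whitelist", whitelist]
    return cmd


sources = ["dc1-consumer.properties", "dc2-consumer.properties"]
target = "hadoop-dc-producer.properties"

# Layout 1: a single instance that reads from every source data center.
option_one = [mirrormaker_cmd(sources, target)]

# Layout 2: one dedicated instance per source data center.
option_two = [mirrormaker_cmd([src], target) for src in sources]

for cmd in option_one + option_two:
    print(" ".join(cmd))
```

In the second layout each process can be stopped, restarted or tuned
without touching the others, which is the isolation I have in mind.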

The second option looks better to me, since it provides some isolation
between the different data centers.

Is there a recommendation for this kind of data aggregation case?

I'm still new to kafka and mirrormaker, so any information is welcome.

Thanks,
Mingjie
