Hi,

I am working with ops to figure out the best deployment plan and
configuration for shipping a new version of our system that will use Kafka.
Multiple geo-distributed datacenters are a given, and we are planning to
build a central DC to aggregate the data.

Ops proposed setting up the mirror to work over the open internet, without a
secured VPN. Security of this particular data is not a concern and, as I
understand it, this will give us more bandwidth (unless we buy some extra
hardware; lots of internal details there).

Is this configuration possible at all? Has anyone tried, or is anyone using,
such a configuration? I'd appreciate any feedback.

My major source of confusion is how MirrorMaker and other producers would
handle external names for the brokers. As I understand it, a producer
connects to the broker in its configuration only to bootstrap (get the list
of all available brokers), and after that talks to the brokers returned
during bootstrapping. So local clients won't work (or will route through the
external interface) if I configure the brokers to use external names, and
remote clients won't work if internal names are configured.
Is there some reasonable way to configure Kafka to support such a scenario?
So far I have only tried opening an ssh tunnel from a devbox to a remote
machine and configuring a local producer to talk to localhost; it failed as
described above.
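
To make the failure mode concrete, here is a toy Python sketch of the
bootstrapping behaviour as I understand it. It does not use any real Kafka
client; the host names, the ADVERTISED list, and both helper functions are
hypothetical and exist only for illustration. The "reachable" set stands in
for whatever addresses a given client can actually connect to.

```python
# Toy model of client bootstrapping; no real Kafka library is involved.
# All host names below are hypothetical.

# Addresses the brokers hand out in cluster metadata (internal names here).
ADVERTISED = ["kafka1.dc1.internal:9092", "kafka2.dc1.internal:9092"]


def bootstrap(bootstrap_addr, reachable):
    """Return the broker addresses a client will use after bootstrapping.

    The bootstrap address is used only to fetch metadata; afterwards the
    client connects to whatever addresses the brokers advertised.
    """
    if bootstrap_addr not in reachable:
        raise ConnectionError(f"cannot reach bootstrap broker {bootstrap_addr}")
    return list(ADVERTISED)


def unreachable_brokers(bootstrap_addr, reachable):
    """List the advertised brokers this client cannot connect to."""
    brokers = bootstrap(bootstrap_addr, reachable)
    return [b for b in brokers if b not in reachable]


# Local client inside the DC: it can reach the internal names directly.
local = unreachable_brokers("kafka1.dc1.internal:9092", set(ADVERTISED))

# Remote client behind an ssh tunnel: only localhost:9092 is reachable.
tunneled = unreachable_brokers("localhost:9092", {"localhost:9092"})

print(local)     # brokers the local client cannot reach
print(tunneled)  # brokers the tunneled client cannot reach
```

In this model the local client ends up with no unreachable brokers, while
the tunneled client bootstraps fine and then cannot reach any of the
advertised brokers, which matches what I saw with the ssh tunnel.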


Also, should I run MirrorMaker in the same DC as the central Kafka cluster,
or run multiple MirrorMakers in the remote DCs?

Any description of how it is set up in your case would be helpful. Do you use
a VPN between DCs? Where do you run MirrorMaker, in the central DC or in the
remote ones, and why?

A lot of questions; thank you in advance for your answers.

----------
Andrey Yegorov
