Hi,

I am working with our ops team to figure out the best deployment plan and configuration for shipping a new version of our system that will use Kafka. Multiple geo-distributed datacenters are a given, and we are planning to build a central DC to aggregate the data.

Ops proposed setting up mirroring over the open internet, without a secured VPN. Security of this particular data is not a concern and, as I understand it, this will give us more bandwidth (unless we buy some extra hardware; lots of internal details there). Is this configuration possible at all? Has anyone tried, or is anyone using, such a configuration? I'd appreciate any feedback.

The major source of confusion is how MirrorMaker and other producers would handle external names for the brokers. As I understand it, a producer connects to the broker listed in its configuration only to bootstrap (to get the list of all available brokers), and after that it talks to the brokers it received during bootstrapping. So local clients won't work (or will route through the external interface) if I configure the brokers to use external names, and remote clients won't work if internal names are configured. Is there some reasonable way to configure Kafka to support such a scenario?

So far I have only tried opening an ssh tunnel from a devbox to a remote machine and configuring a local producer to talk to localhost; it failed as described above.

Also, should I run MirrorMaker in the same DC as the central Kafka cluster, or run multiple MirrorMakers in the remote DCs?

Any description of how this is set up in your case would be helpful. Do you use a VPN between DCs? Where do you run MirrorMaker, in the central DC or in the remote ones, and why?

A lot of questions; thank you in advance for your answers.

----------
Andrey Yegorov
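
P.S. To make the naming problem concrete, here is a toy model of the bootstrap behaviour I describe above. It is only a sketch of my understanding, not real Kafka client code, and all host names in it are hypothetical. The bootstrap address is used for the first request only; after that the client uses whatever names the brokers advertise, so a client that can resolve only `localhost` (my ssh tunnel case) ends up with no usable brokers.

```python
# Toy model of producer bootstrapping; NOT real Kafka client code.
# All host names below are hypothetical.

# Names the brokers advertise in their metadata (internal names here).
ADVERTISED = ["kafka1.dc1.internal:9092", "kafka2.dc1.internal:9092"]


def bootstrap(bootstrap_addr):
    """Pretend to contact one broker and fetch the cluster's broker list.

    The bootstrap address matters only for this first request; the
    returned list is what the client talks to afterwards.
    """
    return list(ADVERTISED)


def reachable(addr, resolvable_hosts):
    """A client can use a broker only if it can resolve its host name."""
    host = addr.split(":")[0]
    return host in resolvable_hosts


def usable_brokers(bootstrap_addr, resolvable_hosts):
    """Brokers the client can actually reach after bootstrapping."""
    return [b for b in bootstrap(bootstrap_addr) if reachable(b, resolvable_hosts)]


# A client inside the DC resolves the internal names.
local_hosts = {"kafka1.dc1.internal", "kafka2.dc1.internal"}
# A devbox behind an ssh tunnel can only reach the tunnel endpoint.
tunnel_hosts = {"localhost"}

print(usable_brokers("kafka1.dc1.internal:9092", local_hosts))
print(usable_brokers("localhost:9092", tunnel_hosts))  # prints []
```

Swapping the advertised names for external ones just moves the failure to the local clients, which is the trade-off I am asking about.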