Hi folks,
I would like some feedback based on your experience running Kafka
streaming applications.

We have a replicated Kafka cluster running in a data center in one city.
We run a Kafka streaming application that reads from a source topic on
that cluster and commits its output to a local database in its own data
center.

The two data centers are about 1000 miles apart and are connected by a
100 Mbps link with high latency (20 - 70 ms).

Our source topic receives 10,000 messages per second, and each message is
around 4 KB.
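
As a rough back-of-the-envelope check of these numbers against the link
described above, here is a small sketch. It assumes 1 KB = 1000 bytes and
ignores protocol overhead, compression, and batching; the variable names
are only illustrative.

```python
# Figures taken from this message; 1 KB is assumed to be 1000 bytes.
messages_per_second = 10_000
message_size_bytes = 4 * 1000
link_capacity_mbps = 100

# Raw payload rate of the source topic in megabits per second.
required_mbps = messages_per_second * message_size_bytes * 8 / 1_000_000

print(f"required: {required_mbps:.0f} Mbps, link: {link_capacity_mbps} Mbps")
print(f"ratio: {required_mbps / link_capacity_mbps:.1f}x link capacity")
```

Under these assumptions, the uncompressed source traffic alone would be
about 320 Mbps, roughly 3.2 times the link capacity, before counting any
changelog traffic.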

The streaming application receives a lot of messages, aggregates them,
sends the aggregated messages to a changelog topic, and then reads them
back from the changelog topic to update its local store. This is a
continuous process, and changelog messages may grow to between 100 KB
and 750 KB.
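
To give a feel for what messages of this size mean on the link, here is a
minimal sketch of the time needed just to put one changelog message on the
wire. It optimistically assumes the full 100 Mbps is available to that one
message, takes 1 KB = 1000 bytes, and adds the quoted latency once per
message; the function name is hypothetical.

```python
def serialization_time_ms(size_bytes: int, link_mbps: float) -> float:
    """Time to push size_bytes onto a link of link_mbps, ignoring latency."""
    return size_bytes * 8 / (link_mbps * 1_000_000) * 1000


LINK_MBPS = 100  # link capacity quoted in this message

for size_kb in (100, 750):
    wire_ms = serialization_time_ms(size_kb * 1000, LINK_MBPS)
    # Adding the 20-70 ms latency once per message is an assumption.
    print(f"{size_kb} KB: {wire_ms:.0f} ms on the wire, "
          f"{wire_ms + 20:.0f}-{wire_ms + 70:.0f} ms including latency")
```

Under these assumptions a 750 KB message needs about 60 ms of wire time on
its own, which is comparable to the link latency itself.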

So you can see that a lot of data travels back and forth between the two
data centers.

In such a scenario, is it advisable to run the streaming application
across a WAN, or is it better to move it into the same LAN as the Kafka
cluster?

We seem to run into request timeouts when running the application over
the WAN as opposed to the LAN, and would like to know whether the network
connection between the two could be the cause.


Please let me know your thoughts.

Thanks
Sachin
