Hi there,

I have several production servers running with different cloud providers,
like AWS, Rackspace, etc., and one central big data environment inside the
company's network. I need to build a NRT (near real time) monitoring
system, for which I need to stream and archive the logs from all of those
servers into that central place.

I have read through the Kafka documentation, and it seems like Kafka
Connect <http://kafka.apache.org/documentation.html#quickstart_kafkaconnect>
does what I want. However, instead of connecting to a local file, I really
need to connect to a file on a remote server. Even more challenging,
because of the firewall I can only pull from the remote servers into the
big data environment; pushing from the remote servers to the big data
environment is not an option.
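
For reference, this is roughly how the file source from the quickstart is
configured (paraphrasing the example properties that ship with Kafka; the
file path and topic name below are just the quickstart's placeholders):

    name=local-file-source
    connector.class=FileStreamSource
    tasks.max=1
    file=test.txt
    topic=connect-test

As far as I can tell, the "file" setting only takes a path on the machine
where the connector runs, so there is no obvious way to point it at a file
sitting on another server, let alone pull it through a firewall.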

Is there any built-in functionality in Kafka that does what I want? If not,
what is the best practice for architecting this?

I have seen Splunk and Sumologic and am really amazed at how well they
work. However, Sumologic at least requires installing a collector on the
remote server, which periodically checks the logs and uploads them to
Sumo's servers. That is kind of cool, but it really won't work in my
scenario because of the firewall.

I have asked something similar on Stackoverflow
<http://stackoverflow.com/questions/34498946/kafka-pull-logs-from-remote-servers>
in case you are interested in getting a few points there :)

Best regards,

Bin
