Hi,

I am Ahmed Ali-Eldin <https://www8.cs.umu.se/~ahmeda/>, a PhD student at Umeå University, Sweden (it is up north <http://tools.wmflabs.org/geohack/geohack.php?pagename=Ume%C3%A5&params=63_49_30_N_20_15_50_E_type:city%2879594%29_region:SE> :) ). I am working with @MarkCC on integrating a distributed logging framework with Aurora and building an analytics framework on top of it to analyze the logged data.

We started off by looking into different logging frameworks (Kafka <http://kafka.apache.org/>, Scribe <https://github.com/facebook/scribe>, Chukwa <https://chukwa.apache.org/>, Suro <http://techblog.netflix.com/2013/12/announcing-suro-backbone-of-netflixs.html>, Calligraphus <http://www-conf.slac.stanford.edu/xldb2011/talks/xldb2011_tue_0940_facebookrealtimeanalytics.pdf>, and Flume <http://flume.apache.org/>). Of these, we chose Suro coupled with Kafka, for the following reasons:

i- It was built to scale up and down (elastic).
ii- It is quite flexible, with a Kafka sink that gives us access to all Kafka sinks.
iii- It has an S3 sink, making it a suitable solution for more scenarios.
iv- I got a tip from someone I know at Netflix on Suro benchmarking results.
v- It is an active project.

Based on the above, I have started some experiments with Suro and will be looking at its integration with Aurora this weekend. I cannot say whether Suro (coupled with Kafka) is "the best" solution for distributed logging, but it looks very promising so far. I will hopefully send some results/updates late next week.

Best,
--Ahmed