GSoC work for Aurora and some updates

Ahmed Aley Wed, 21 May 2014 23:16:24 -0700

Hi,

I am Ahmed Ali-Eldin <https://www8.cs.umu.se/~ahmeda/>, a PhD student at
Umeå University, Sweden (it is up north
<http://tools.wmflabs.org/geohack/geohack.php?pagename=Ume%C3%A5&params=63_49_30_N_20_15_50_E_type:city%2879594%29_region:SE> :) ).
I am working with @MarkCC on integrating a distributed logging
framework with Aurora and building an analytics framework on top of it to
analyze the logged data.
We started off by looking into different logging frameworks
(Kafka <http://kafka.apache.org/>,
Scribe <https://github.com/facebook/scribe>,
Chukwa <https://chukwa.apache.org/>,
Suro <http://techblog.netflix.com/2013/12/announcing-suro-backbone-of-netflixs.html>,
Calligraphus <http://www-conf.slac.stanford.edu/xldb2011/talks/xldb2011_tue_0940_facebookrealtimeanalytics.pdf> and
Flume <http://flume.apache.org/>). Of these, we chose Suro coupled with
Kafka, for several reasons:
i- It has been built to be elastic, scaling both up and down.
ii- It is quite flexible: its Kafka sink gives us access to all Kafka
sinks.
iii- It has an S3 sink, making it a suitable solution for more scenarios.
iv- I got a tip from someone I know at Netflix on Suro benchmarking results.
v- It is an active project.


Based on the above, I have started some experiments with Suro and will be
looking at its integration with Aurora this weekend. I cannot yet say
whether Suro (coupled with Kafka) is "the best" solution for
distributed logging, but it looks very promising so far. I will hopefully
send some results/updates late next week.

Best,
--Ahmed
