Dear All: I was looking for a tutorial on how to build and run Samza on AWS, and I found the link below. Is there a more detailed tutorial on how to build Samza on AWS?
Sincerely,
Selina

https://cwiki.apache.org/confluence/display/SAMZA/FAQ#FAQ-HowshouldSamzaberunonAWS?

How should Samza be run on AWS?

From Gian Merlino:

- We've been using Samza in production on AWS for a little over a month. We're just using the YARN runner on a mostly stock Hadoop 2.4.0 cluster (not EMR). Our experience is that c3s work well for the YARN instances and i2s work well for the Kafka instances. Things have been pretty solid with that setup.

For scaling YARN up and down, we just add or terminate instances, and this works pretty well. It can take a few minutes for the cluster to realize a node is gone and respawn its containers elsewhere.

We have a separate Kafka cluster just for Samza's use, different from our main Kafka cluster. The main reason is that we wanted to isolate the disk and network load of state compactions and restores (we don't use compacted topics in our main Kafka cluster, but we do use them with Samza, and the extra load on Kafka can be substantial).
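
For illustration only, the two-cluster setup described in the FAQ answer might look roughly like the job config fragment below. This is a sketch under assumptions, not the configuration Gian's team actually uses: the system names (mainkafka, samzakafka), the store name (my-store), the topic names, and all hostnames are hypothetical, and the exact property keys depend on the Samza version (older releases use different producer settings), so check the configuration reference for your release.

```properties
# Run the job on YARN rather than locally.
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
job.name=example-job

# Hypothetical "main" Kafka cluster: input topics are read from here.
systems.mainkafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.mainkafka.consumer.zookeeper.connect=main-zk.example.internal:2181/
systems.mainkafka.producer.bootstrap.servers=main-kafka.example.internal:9092

# Hypothetical Kafka cluster dedicated to Samza: checkpoints and changelogs live here,
# so compaction and state-restore load stays off the main cluster.
systems.samzakafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.samzakafka.consumer.zookeeper.connect=samza-zk.example.internal:2181/
systems.samzakafka.producer.bootstrap.servers=samza-kafka.example.internal:9092

# Input comes from the main cluster (topic name is a placeholder).
task.inputs=mainkafka.input-topic

# Checkpoints go to the Samza-only cluster.
task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=samzakafka

# The store's changelog (a compacted topic) also goes to the Samza-only cluster.
stores.my-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
stores.my-store.changelog=samzakafka.my-store-changelog
```

The point of the split is that only task.inputs touches the main cluster, while checkpoint and changelog traffic is confined to the dedicated one.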
