Hi, I ran the commands in http://samza.apache.org/startup/hello-samza/0.9/ successfully. Fascinating stuff!
I was running all the processes on my (fairly recent model) MacBook Pro. One aspect I've heard about Kafka and Samza is performance -- handling thousands of messages a second. E.g., http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines talks about doing millions of writes a second. The rate at which the console emitted new messages seemed far slower than that -- maybe on the order of 1-2 a second. I ran the commands exactly as listed on the tutorial page.

Of course a laptop is vastly different from a production setup -- what kind of assumptions can you make about the performance of Samza jobs in development mode? I realize it depends on what you're doing -- it's just very different from what I was expecting.

Also, I'm not really sure about the best way to get started with writing my own Samza jobs. Is there a project template to work off of? Is the Hello Samza project it? Maybe import the Maven POM into a favorite IDE and rip out the Wikipedia-related classes? As someone who has written Java before but doesn't write it every day, it wasn't immediately clear to me.

Apologies if these are addressed in blog posts/FAQs/documentation and I failed to research them adequately.

Thanks!
Warren