Hi Warren,

Yes, I think Hello Samza is the template project to work from. I believe the slow message rate you are seeing is because it's subscribed to the Wikipedia IRC stream, which may only generate a few events per second.
That said, some of the example configuration for the Hello Samza demo is not tuned for performance. In general, enabling compression can help a lot for jobs that are I/O bound. Enabling lz4 on JSON data, for example, can shrink it by roughly 10x. On the consumer side, setting task.consumer.batch.size might help. On the producer side, you might want to play around with these settings:

systems.kafka.producer.compression.type
systems.kafka.producer.batch.size
systems.kafka.producer.linger.ms

http://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html
http://kafka.apache.org/documentation.html#newproducerconfigs

Cheers,

Roger

On Thu, Apr 9, 2015 at 1:14 AM, Warren Henning <warren.henn...@gmail.com> wrote:

> Hi,
>
> I ran the commands in http://samza.apache.org/startup/hello-samza/0.9/
> successfully. Fascinating stuff!
>
> I was running all the processes on my (fairly recent model) Macbook Pro.
> One aspect I've heard about Kafka and Samza is performance -- handling
> thousands of messages a second. E.g.,
>
> http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> talks about doing millions of writes a second. The rate at which the
> console emitted new messages seemed like a rate far slower than that --
> maybe something on the order of 1-2 a second. I ran the commands and
> everything exactly as is listed on the tutorial page.
>
> Of course a laptop is vastly different from a production setup -- what kind
> of assumptions can you make about performance of Samza jobs in development
> mode? I realize it depends on what you're doing -- it's just very different
> from what I was expecting.
>
> Also, I'm not really sure about the best way to get started with writing my
> own Samza jobs. Is there a project template to work off of? Is the Hello
> Samza project it? Maybe import the Maven POM into a favorite IDE and rip
> out the Wikipedia-related classes? As someone who has written Java before
> but doesn't write it every day, it wasn't immediately clear to me.
>
> Apologies if these are addressed in blog posts/FAQs/documentation and I
> failed to research them adequately.
>
> Thanks!
>
> Warren
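
For reference, the tuning settings named in the reply could be sketched in a job's properties file roughly as follows. This is only a starting point: the setting names come from the reply, but every value is an illustrative assumption, not a recommendation from the thread, and should be measured against your own workload. It also assumes the Kafka system is named "kafka" in the job config, as the setting names in the reply imply.

```properties
# Illustrative values only (assumptions); benchmark before adopting any of them.

# Consumer side: try to consume batches of this many messages from each
# input stream instead of switching streams on every message.
task.consumer.batch.size=100

# Producer side: these are passed through to the Kafka producer.
# lz4 is the codec mentioned in the reply; gzip and snappy are alternatives.
systems.kafka.producer.compression.type=lz4
# Maximum size of a producer batch, in bytes.
systems.kafka.producer.batch.size=262144
# How long, in milliseconds, the producer waits for more records to fill a batch.
systems.kafka.producer.linger.ms=5
```

Larger batches and a small linger time generally trade a little latency for throughput, and they also give the compressor more data to work with per request.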