Hey all,
I'm trying to run Samza on a 5-node cluster on AWS, with each box running
all three processes (YARN, Kafka, and ZK). I have been facing very weird
performance issues with Kafka when run this way. Kafka seems to get
unbalanced very often, with replicas going out of sync. This results in lost
messages when producing to this cluster. I initially suspected it was a
scale issue (70k-80k qps of incoming messages, ~120k qps peak) and reduced
write throughput by sampling just 10% of the messages, but I still noticed
the same issues. The weird part is that this doesn't happen on every run,
but it does on many of them.

We have been using a much larger Kafka cluster for a long time with great
performance and have never seen such issues before. Then I saw
https://engineering.linkedin.com/samza/operating-apache-samza-scale, which
mentions that LinkedIn also faced some issues when collocating Samza and
Kafka.

Can someone shed some light on this? Is collocating Samza and Kafka a
strict no, or is it more likely a Kafka/machine tuning issue? Any help is
appreciated!

Kafka version: 0.8.1.1
Samza version: 0.8

Thanks a lot for your time,
Karthik
