Re: Kafka Streams scaling questions

2016-03-22 Thread Kishore Senji
Hi Ben, Thank you for the reply. Let us say we are processing Apache access logs (or something similar) for a service which is served by n nodes. We would want to process 1) errors per consumer (identified by client IP or something in the headers) and 2) errors per URL. These are different group-by that we n

Re: Kafka Streams scaling questions

2016-03-22 Thread Ben Stopford
Hi Kishore, In general I think it’s up to you to choose keys that keep related data together, but also give you reasonable load balancing. I’m afraid that I’m not sure I fully followed your explanation of how Storm solves this problem more efficiently though. I noticed you asked: "How would th
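
To make the key-choice point in this thread concrete, here is a minimal sketch in plain Python (not the Kafka Streams API) of how re-keying the same access-log records by client IP and by URL sends each key to exactly one partition, so each group-by can be counted locally. The partition count, the sample records, the "status >= 500 is an error" rule, and the helper names are all assumptions made for illustration. CRC32 stands in for whatever hash a real partitioner uses.

```python
import zlib
from collections import Counter, defaultdict

NUM_PARTITIONS = 4  # assumed partition count, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical access-log records: (client_ip, url, http_status)
LOGS = [
    ("10.0.0.1", "/login", 500),
    ("10.0.0.2", "/login", 200),
    ("10.0.0.1", "/cart", 502),
    ("10.0.0.3", "/login", 503),
    ("10.0.0.2", "/cart", 404),
]


def errors_by(records, key_fn):
    """Re-key error records with key_fn, route by partition, count per key."""
    partitions = defaultdict(Counter)
    for record in records:
        if record[2] >= 500:  # assumption: only 5xx responses count as errors
            key = key_fn(record)
            partitions[partition_for(key)][key] += 1
    return partitions


def merged(partitions):
    """Combine the per-partition counters into one view, for inspection."""
    total = Counter()
    for counts in partitions.values():
        total.update(counts)
    return total


# Two different group-bys over the same input, each with its own key.
by_client = errors_by(LOGS, lambda r: r[0])
by_url = errors_by(LOGS, lambda r: r[1])

print(dict(merged(by_client)))
print(dict(merged(by_url)))
```

Because every record with a given key lands in the same partition, no cross-partition merge is needed to get a correct count for that key. The trade-off raised in the thread still applies: a very hot key (one busy client or one popular URL) concentrates load on a single partition.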