Re: Dinamically asigning output topic

2015-08-03 Thread Karthik Sriram
I believe you only need to declare the system stream you are writing to beforehand, i.e. your Kafka broker list if you are using Kafka. If auto-create topics is enabled, you can dynamically write to any topic within a predefined system stream. I'd suggest simply using a hash map of topic to OutputStr
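
The caching idea in this message can be sketched in a language-neutral way. The snippet below is a minimal illustration, not Samza code: `StreamHandle`, `TopicRouter`, and the topic names are hypothetical stand-ins, and the real job would hold whatever output-stream object its framework provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamHandle:
    """Hypothetical stand-in for a (system, topic) output handle."""
    system: str
    topic: str


class TopicRouter:
    """Creates one handle per topic on first use and reuses it afterwards."""

    def __init__(self, system):
        self.system = system      # the single system declared up front
        self._handles = {}        # topic name -> StreamHandle

    def handle_for(self, topic):
        handle = self._handles.get(topic)
        if handle is None:
            # First message for this topic: build the handle and cache it.
            handle = StreamHandle(self.system, topic)
            self._handles[topic] = handle
        return handle


router = TopicRouter("kafka")
first = router.handle_for("clicks-eu")
again = router.handle_for("clicks-eu")   # served from the cache
other = router.handle_for("clicks-us")   # new topic, new handle
```

Only the system name is fixed ahead of time; topics are resolved per message, which matches the suggestion as long as the broker is allowed to create topics on demand.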

Re: Windowing Guarantees in samza

2015-02-15 Thread Karthik Sriram
Hi Benjamin, I'm trying to employ Samza for a similar use case, and this is what I did to mitigate this: 1> I have a notion of timestamp in the messages themselves that I listen to. This way, as I get messages, I can maintain state by time period of aggregation by attaching the time period to the key
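
Point 1 above, keying state by the aggregation period taken from the message's own timestamp, might look roughly like the sketch below. It is a plain-Python illustration under stated assumptions: the 60-second window, the `(key, timestamp)` message shape, and the function names are all hypothetical, and a real job would keep the counts in its state store rather than a dict.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed aggregation period


def window_start(timestamp, window=WINDOW_SECONDS):
    """Round an event timestamp (integer seconds) down to its window start."""
    return timestamp - (timestamp % window)


def aggregate(messages, window=WINDOW_SECONDS):
    """Count messages per (key, window start), using each message's own timestamp."""
    counts = defaultdict(int)
    for key, timestamp in messages:
        # The time period becomes part of the state key.
        counts[(key, window_start(timestamp, window))] += 1
    return dict(counts)


# The third message arrives before the second but belongs to a later window.
messages = [("user-1", 5), ("user-1", 61), ("user-1", 59), ("user-2", 30)]
totals = aggregate(messages)
```

Because the bucket is derived from the message timestamp rather than arrival time, a late or out-of-order message still updates the window it belongs to.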

Re: Collocating Samza(YARN) and Kafka/ZK clusters

2015-02-13 Thread Karthik Sriram
> > > replica out of sync if it falls more than N messages behind. Can you try tuning this setting as described here: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowtoreducechurnsinISR?WhendoesabrokerleavetheISR?

Collocating Samza(YARN) and Kafka/ZK clusters

2015-02-09 Thread Karthik Sriram
Hey all, I'm trying to run Samza on a 5-node (YARN/Kafka/ZK) cluster on AWS, with each box running all 3 processes. I have been facing very weird performance issues with Kafka when run this way. Kafka seems to get unbalanced very often, with replicas frequently going out of sync. This results in