I believe you only need to declare the system stream you are writing to
before hand . ie your Kafka broker list if you are using Kafka. If auto
create topics is enabled, you can dynamically write to any topic within a
predefined system stream. I'd suggest simply use a hash map of topic to
OutputStr
Hi Benjamin,
I'm trying to employ samza for a similar use case and this is what I did
to mitigate this:
1> I have a notion of timestamp in the messages itself that I listen to.
This way, as I get messages, I can maintain state by time period of
aggregation by attaching the time period to the key
replica out
> of
> > > sync if it falls more than N messages behind. Can you try tuning this
> > > setting as described here:
> > > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#
> > > FAQ-HowtoreducechurnsinISR?WhendoesabrokerleavetheISR
> > > ?
&g
Hey all,
I'm trying to run samza on a 5 node (YARN/Kafka/ZK) cluster with each box
running all 3 processes on AWS. I have been facing very weird performance
issues with Kafka when run this way. Kafka seems to get unbalanced very
often with replicas going out of sync every so often. This results in