A broadcast stream is the Samza feature designed for exactly this requirement: every task receives every message from the partitions you mark as broadcast. Your thinking is along the right lines.
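
As a rough sketch, the job configuration might look something like the fragment below. The system name "kafka" and the topic name "ActionEvents" are assumptions on my part, as is the control topic having a single partition (partition 0). The two bootstrap-related lines only matter if you also want the control stream read from the beginning before other input is processed, as you described.

```properties
# Assumed names: system "kafka", event topic "ActionEvents" (hypothetical),
# control topic "ControlEventStream" with one partition (0).

# Regular input: its 5 partitions are spread across the tasks.
task.inputs=kafka.ActionEvents

# Broadcast input: partition 0 of the control topic is delivered to every task.
# Format: system.stream#partition, or system.stream#[start-end] for a range.
task.broadcast.inputs=kafka.ControlEventStream#0

# Optional: treat the control topic as a bootstrap stream and read it from the oldest offset.
systems.kafka.streams.ControlEventStream.samza.bootstrap=true
systems.kafka.streams.ControlEventStream.samza.offset.default=oldest
```

With something like this in place, each task sees the same ControlEvents and can keep its own copy of the ZipCode filters per user.
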
Please take a look at task.broadcast.inputs from the Samza configuration reference page.

On Friday, March 18, 2016, Louisia Famalda <louisia.fama...@gmail.com> wrote:

> Hi,
>
> I'm new to Samza and I'm trying to do this:
> From my standalone app, I'm writing some ActionEvents to a Kafka topic with
> 5 partitions
>
> From Samza, I want to process those events but I need to send some filters
> information on how to process those events.
> For example, I want to keep all messages that are coming from a specific
> ZipCode.
>
>
> My thinking is to have another Topic: ControlEventStream, which will be a
> bootstrap kafka stream.
> ControlEvent will contains a userId as the key and a list of zipCode as a
> value.
> My Samza task will read from those 2 topics, get the filter directives from
> the ControlEventStream and then start processing all events.
> A user of the system, will then be able to send a ControlEvent with a new
> list of ZipCode that he wants to watch for.
> * The number of users using the system is less than 10.
>
> - Am I on the right track by using bootstrap stream + compaction to define
> the list of filters that a specific user is interested in ?
>
> Now, I need the Samza instantiated tasks to be able to receive update so a
> user can Add/remove/update the list of ZipCode that he is looking for.
> All ActionEvents are distributed cross all partitions and process in a
> distributed way.
>
> How do I guarantee that all Samza tasks will all receive ALL messages from
> ControlEventStream, so they all have the same set of ZipCode filters.
>
> Thanks in advance,
> Louisia.
>

--
Sent from my iphone.