I think what you're describing could be handled in Kafka Streams by a "global"
KTable. This functionality is currently being discussed and voted on in a KIP:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67633649

The list of interests would be a global KTable (shared globally across all
Streams instances), and each instance would filter/categorize its share of the
content stream by joining against that table.
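
To make the idea a bit more concrete, here is a rough plain-Java model of the semantics I have in mind. It does not use the Kafka Streams API at all (that is still being settled in the KIP), and every name in it (GlobalTableSketch, Instance, broadcastInterest, routeContent) is made up for illustration. The content stream is sharded across instances by key, while every interest update is delivered to all instances, so each one can do the lookup locally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model only: no Kafka classes are used here.
public class GlobalTableSketch {

    /** One Streams instance: sees one shard of content, holds ALL interests. */
    public static final class Instance {
        private final Map<String, String> interests = new HashMap<>();
        private final List<String> matches = new ArrayList<>();

        // Table update; a null label models a delete of that interest.
        public void onInterestUpdate(String key, String label) {
            if (label == null) {
                interests.remove(key);
            } else {
                interests.put(key, label);
            }
        }

        // "Join": keep the record only if its key is in the local table copy.
        public void onContent(String key, String body) {
            String label = interests.get(key);
            if (label != null) {
                matches.add(label + ":" + body);
            }
        }

        public List<String> matches() {
            return matches;
        }
    }

    public static List<Instance> startInstances(int n) {
        List<Instance> instances = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            instances.add(new Instance());
        }
        return instances;
    }

    // Interest updates go to every instance (the "global" part).
    public static void broadcastInterest(List<Instance> instances, String key, String label) {
        for (Instance instance : instances) {
            instance.onInterestUpdate(key, label);
        }
    }

    // Content goes to exactly one instance, chosen by key hash (the sharded part).
    public static void routeContent(List<Instance> instances, String key, String body) {
        int shard = Math.floorMod(key.hashCode(), instances.size());
        instances.get(shard).onContent(key, body);
    }

    public static List<String> allMatches(List<Instance> instances) {
        List<String> all = new ArrayList<>();
        for (Instance instance : instances) {
            all.addAll(instance.matches());
        }
        return all;
    }

    public static void main(String[] args) {
        List<Instance> instances = startInstances(3);
        broadcastInterest(instances, "kafka", "streaming");
        routeContent(instances, "kafka", "post-1"); // matches an interest
        routeContent(instances, "cats", "post-2");  // no interest, dropped
        broadcastInterest(instances, "kafka", null); // interest removed everywhere
        routeContent(instances, "kafka", "post-3"); // no longer matches
        System.out.println(allMatches(instances));
    }
}
```

The point of the sketch is only that the lookup never depends on which shard a content record lands on, because every instance has the full interests table.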

-Ewen

On Fri, Dec 30, 2016 at 3:02 PM, Matt King <kyrri...@gmail.com> wrote:

> I'd like to have the following:
>
> One large stream of content coming through a topic, with Kafka Stream
> filtering to identify records of interest.   I can see how this would be
> sharded to allow scale out to handle a large stream of content.
>
> I would like to have a 2nd, smaller, topic to define the areas of interest
> that would be used for the filtering.  This topic should be available to
> all the stream processing filters.  When a new filter comes up it should be
> able to recreate its state.   As area of interest definitions change these
> changes should also go out to all the filtering applications.
>
> Can this be done directly with Kafka Streams?  I can get the primary stream
> working with a static set of interests and the filtering works fine.  But
> adding in a second input stream I'm having trouble.  There doesn't seem to
> be a way to have the same topic/partition go to all the applications?
> Alternatively I could imagine broadcasting the interests to multiple
> partitions but don't see how that is done.
>
> Perhaps the area-of-interest topic should be done using a plain old Kafka
> producer, sending it to all partitions?
>
> Am I making sense?
>
> Happy New Year
>
> Matt
>
