Hello Bart,

Before we talk about dynamic subscription in Streams I'd like to ask some
more questions about your usage.

Let's say the currently running application is reading from topic A and has
just reached offset 100; now a notification arrives from the global state
saying "stop on topic A, start on topic B". How would you start fetching
from topic B? Would you start at the earliest or latest offset (so the
starting point is unrelated to where you stopped on topic A), or would you
start at offset 100? In addition, if your application maintains some local
state, how would that state be affected by the topic switch? At that point
it reflects "the state of the application up to topic A's offset 100";
could you still reuse that state with topic B at the specified offset?

To make things a bit more complicated, if another notification later
arrives from the global state saying "now switch back to topic A again",
would you restart where you stopped, i.e. at offset 100, or would you start
at the earliest or latest offset of topic A? And is the application state
still reusable?
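
To make the choices in these questions concrete, here is a minimal
plain-Java sketch of the three possible answers. It uses no Kafka classes;
ResumePolicy, TopicSwitcher and their methods are hypothetical names, and
the earliest/latest arguments stand in for whatever the broker would
report for the topic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the three possible answers; not a Kafka API.
enum ResumePolicy { EARLIEST, LATEST, LAST_STOPPED }

class TopicSwitcher {
    // Position reached on each topic at the moment it was stopped.
    private final Map<String, Long> stoppedAt = new HashMap<>();

    // Record where processing stopped on a topic.
    void stop(String topic, long offset) {
        stoppedAt.put(topic, offset);
    }

    // Decide the starting offset when a topic is started or restarted.
    long startOffset(String topic, ResumePolicy policy, long earliest, long latest) {
        switch (policy) {
            case EARLIEST:
                return earliest;
            case LATEST:
                return latest;
            default:
                // Offsets are per topic: a topic that was never read has no
                // stopped position, so this sketch falls back to earliest.
                return stoppedAt.getOrDefault(topic, earliest);
        }
    }
}
```

With this model, after stop("A", 100), LAST_STOPPED gives 100 when
switching back to A, but gives the earliest offset for B, because A's
offset 100 says nothing about a position in B.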

I think if your answer to these questions is "no", then you'd better treat
them as different applications, i.e. one app reading from A and one app
reading from B, but with similar topologies. It's just that, based on the
meta topic, the apps may be stopped / restarted dynamically. Currently Kafka
Streams does not support dynamic subscription, since many of the scenarios
behind such requests turned out not to fit well into a single application.
If this is indeed a common request, I think one way to do it is to make the
PartitionAssignor customizable by users, as you suggested, so that only the
selected partitions are used to re-form the tasks; but one would still need
some way to trigger a rebalance so that the PartitionAssignor gets called.
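
As a rough illustration of the "different applications" approach, the
sketch below models one app per topic, each with its own local state,
switched on and off by events from the meta topic. It is plain Java with
no Kafka classes; TopicApp and AppSupervisor are hypothetical names, and
in a real deployment each app would presumably be a separate Kafka Streams
application with its own application.id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one application bound to exactly one topic.
class TopicApp {
    final String topic;
    boolean running = false;
    long processed = 0; // local state owned by this app alone

    TopicApp(String topic) {
        this.topic = topic;
    }

    // Records are only applied to the state while the app is running.
    void process(String record) {
        if (running) {
            processed++;
        }
    }
}

// Hypothetical supervisor that reacts to events from the meta topic.
class AppSupervisor {
    private final Map<String, TopicApp> apps = new HashMap<>();

    // Start or stop the app for a topic; its state is kept across stops.
    void onMetaEvent(String topic, boolean enabled) {
        apps.computeIfAbsent(topic, TopicApp::new).running = enabled;
    }

    // Route a record to the app that owns its topic, if there is one.
    void deliver(String topic, String record) {
        TopicApp app = apps.get(topic);
        if (app != null) {
            app.process(record);
        }
    }

    // Number of records the app for this topic has processed so far.
    long processed(String topic) {
        TopicApp app = apps.get(topic);
        return app == null ? 0 : app.processed;
    }
}
```

Because each app keeps its own state, stopping A and starting B never
mixes "state up to A's offset 100" with records from B, and switching back
to A later finds A's state as it was left.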


Guozhang

On Fri, Aug 11, 2017 at 1:42 AM, Bart Vercammen <b...@cloutrix.com> wrote:

> Hi,
>
> I have a question about the best way to implement something within Kafka
> Streams.  What I would like to do is "dynamically update the subscription
> pattern of the source topics".
>
> The reasoning behind this (in my project):
> metadata about the source topics is published on another Kafka topic that
> should be tracked by the Kafka Streams topology, and depending on that
> metadata, specific source topics should be added to or removed from the
> Kafka Streams topology.
>
> Currently I track the "meta data topic" as "global state", so that every
> processor can actually access it to fetch the meta data (this meta data for
> instance also describes whether or not a specific topic pattern should be
> tracked by the stream processor) - so consider this as some kind of
> "configuration" stream about the source topics.
>
> So now it comes,
> Is there any way I could (from a running topology) update the kafka
> consumer subscriptions?
> So that I'm able to replace the source topic pattern while the topology is
> running?
>
> I don't think there is currently a way to do this, but since under the hood
> it is just a Kafka consumer, my belief is that it should be possible somehow
> ...
>
> I was thinking about the PartitionAssignor ... if I could get my hands on
> that one, maybe I could dynamically configure it to only allow specific
> topic-patterns?
> Or directly alter the subscription on the underlying consumer?
>
> I don't know all the nifty details about the Kafka Streams internals, so it
> would be nice if someone could direct me in the right direction to achieve
> this ...
>
> Thanks,
> Bart
>


