Mattias,
That's disappointing given that Kafka offers me the ability to rewind and
replay data. My use case is that we are building graph data structures based on
data indexed from a live stream. At any time, the live data content may be
marked for deletion for any number of reasons; but during that marking process
if a graph structure is being built, it may not realize the data was marked for
deletion (i.e. there is a race between graph referencing the data and the data
being removed).
We need to be able to subsequently go back and clean up the graph data once
we realize the graph contains data that was marked for deletion. But we can't
delete/cleanup the graph until it completes...so we thought we could track all
data referenced by the graph being created and once it was complete,
subsequently replay the data references and determine if any were marked for
removal and subsequently clean up the graph. We hoped that by sending
"start/end" indicators into a graph data reference topic, some KStreams flow
could see the "end", recognize that the graph completed, and simply replay all
its data references to cleanup the graph. I guess we could use a standard
consumer and do this outside of KStreams. Not a big deal...was just hoping to
keep things in the KStreams realm. I'm sure there are other ways to solve this
even outside of using Kafka at all; but why do that? :)
Mike
On Thursday, June 2, 2016 8:59 AM, Matthias J. Sax <[email protected]>
wrote:
Hi Mike,
currently, this is not possible. We are already discussing some changes
with regard to reprocess. However, I doubt that going back to a specific
offset of a specific partition will be supported as it would be too
difficult to reset the internal data structures and intermediate results
correctly (also with regard to committing)
What is your exact use case? What kind of feature are you looking for?
We are always interested to get feedback/idea from users.
-Matthias
On 06/01/2016 08:21 PM, Michael D. Coon wrote:
> All,
> I think it's great that the ProcessorContext offers the partition and offset
>of the current record being processed; however, it offers no way for me to
>actually use the information. I would like to be able to rewind to a
>particular offset on a partition if I needed to. The consumer is also not
>exposed to me so I couldn't access things directly that way either. Is this in
>the works or would it interfere with rebalancing/auto-commits?
> Mike
>
>