Matthias,
   That's disappointing given that Kafka offers me the ability to rewind and 
replay data. My use case is that we are building graph data structures based on 
data indexed from a live stream. At any time, the live data content may be 
marked for deletion for any number of reasons. If a graph structure is being 
built during that marking process, it may not realize the data was marked for 
deletion (i.e. there is a race between the graph referencing the data and the 
data being removed). 

   We need to be able to go back and clean up the graph data once we realize 
the graph contains data that was marked for deletion. But we can't clean up 
the graph until it completes...so we thought we could track all data referenced 
by the graph being created and, once it was complete, replay the data 
references, determine if any were marked for removal, and clean up the graph 
accordingly. We hoped that by sending "start/end" indicators into a graph data 
reference topic, some KStreams flow could see the "end", recognize that the 
graph completed, and simply replay all its data references to clean up the 
graph. I guess we could use a standard consumer and do this outside of 
KStreams. Not a big deal...was just hoping to keep things in the KStreams 
realm. I'm sure there are other ways to solve this even outside of using Kafka 
at all; but why do that? :)
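
   To make the start/end idea concrete, here is a rough sketch of the replay 
logic with no Kafka dependency at all. It is only an illustration under stated 
assumptions: an in-memory list stands in for the graph data reference topic 
(the list index plays the role of the offset), and the names 
GraphCleanupSketch, RefEvent, Kind and findStaleReferences are hypothetical, 
not part of Kafka or KStreams. Against a real topic, the replay step would 
presumably be done with a plain consumer positioned at the remembered start 
offset (e.g. via KafkaConsumer.seek).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types exist in Kafka or Kafka Streams.
class GraphCleanupSketch {
    enum Kind { START, REF, END }

    // One record in the (simulated) graph data reference topic.
    static final class RefEvent {
        final Kind kind;
        final String graphId;
        final String dataId; // null for START and END markers

        private RefEvent(Kind kind, String graphId, String dataId) {
            this.kind = kind;
            this.graphId = graphId;
            this.dataId = dataId;
        }
        static RefEvent start(String graphId) { return new RefEvent(Kind.START, graphId, null); }
        static RefEvent ref(String graphId, String dataId) { return new RefEvent(Kind.REF, graphId, dataId); }
        static RefEvent end(String graphId) { return new RefEvent(Kind.END, graphId, null); }
    }

    // Scans the log; once the END marker for graphId is seen, replays the
    // range [START offset, END offset] and returns the data ids referenced by
    // that graph that are in the deleted set. Returns an empty list if the
    // graph has not completed yet.
    static List<String> findStaleReferences(List<RefEvent> log, String graphId, Set<String> deleted) {
        int startOffset = -1;
        for (int offset = 0; offset < log.size(); offset++) {
            RefEvent e = log.get(offset);
            if (!e.graphId.equals(graphId)) {
                continue; // record belongs to another graph
            }
            if (e.kind == Kind.START) {
                startOffset = offset; // remember where to rewind to
            } else if (e.kind == Kind.END && startOffset >= 0) {
                return replay(log, graphId, startOffset, offset, deleted);
            }
        }
        return Collections.emptyList();
    }

    // The "rewind and replay" step over a bounded offset range.
    private static List<String> replay(List<RefEvent> log, String graphId,
                                       int from, int to, Set<String> deleted) {
        Set<String> stale = new LinkedHashSet<>(); // de-duplicates, keeps order
        for (int offset = from; offset <= to; offset++) {
            RefEvent e = log.get(offset);
            if (e.kind == Kind.REF && e.graphId.equals(graphId) && deleted.contains(e.dataId)) {
                stale.add(e.dataId);
            }
        }
        return new ArrayList<>(stale);
    }

    public static void main(String[] args) {
        List<RefEvent> log = Arrays.asList(
            RefEvent.start("g1"),
            RefEvent.ref("g1", "d1"),
            RefEvent.start("g2"),
            RefEvent.ref("g2", "d2"),
            RefEvent.ref("g1", "d2"),
            RefEvent.ref("g1", "d3"),
            RefEvent.end("g1"));
        Set<String> deleted = new HashSet<>(Arrays.asList("d2", "d3"));
        System.out.println(findStaleReferences(log, "g1", deleted)); // prints [d2, d3]
        System.out.println(findStaleReferences(log, "g2", deleted)); // prints [] (g2 has no END yet)
    }
}
```

   The point of the sketch is that the cleanup only needs the start offset, 
the end offset and the set of deleted ids, so it could live in a small 
standalone consumer next to the KStreams job.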
Mike

 

    On Thursday, June 2, 2016 8:59 AM, Matthias J. Sax <matth...@confluent.io> 
wrote:
 

 Hi Mike,

Currently, this is not possible. We are already discussing some changes
with regard to reprocessing. However, I doubt that going back to a specific
offset of a specific partition will be supported, as it would be too
difficult to reset the internal data structures and intermediate results
correctly (also with regard to committing).

What is your exact use case? What kind of feature are you looking for?
We are always interested in getting feedback/ideas from users.


-Matthias

On 06/01/2016 08:21 PM, Michael D. Coon wrote:
> All,
>  I think it's great that the ProcessorContext offers the partition and offset 
>of the current record being processed; however, it offers no way for me to 
>actually use the information. I would like to be able to rewind to a 
>particular offset on a partition if I needed to. The consumer is also not 
>exposed to me so I couldn't access things directly that way either. Is this in 
>the works or would it interfere with rebalancing/auto-commits?
> Mike
> 
> 


  
