Sure. Sorry I was not clear. Thank you!
lör 10 mars 2018 kl. 00:54 skrev Matthias J. Sax <matth...@confluent.io>: > If there is only one partition by task, processing order is guaranteed. > > For default partitions grouper, it depends on your program. If you read > from multiple topics and join/merge them, a task gets multiple > partitions (from different topics) assigned. > > > -Matthias > > On 3/9/18 2:42 PM, Stas Chizhov wrote: > >> Also note, that the processing order might slightly differ if you > > process the same data twice .... > > > > Is this still a problem when default partition grouper is used (with 1 > > partition per task)? > > > > Thank you, > > Stanislav. > > > > > > > > 2018-03-09 3:19 GMT+01:00 Matthias J. Sax <matth...@confluent.io>: > > > >> Thanks for the explanation. > >> > >> Not sure if setting the metadata you want to get committed in > >> punctuation() would be sufficient. But I would think about it in more > >> details if we get a KIP for this. > >> > >> It's correct that flushing and committing offsets is correlated. But > >> it's not related to punctuation. > >> > >> Also note, that the processing order might slightly differ if you > >> process the same data twice (it depends in which order the brokers > >> return data on poll() and that it something Streams cannot fully > >> control). Thus, you code would need to be "robust" against different > >> processing orders (ie, if there are multiple input partitions, you might > >> get data first for partition 0 and there for partition 1 or the other > >> way round -- the order per partitions is guaranteed to be in offset > order). > >> > >> > >> -Matthias > >> > >> > >> > >> On 3/6/18 2:17 AM, Stas Chizhov wrote: > >>> Thank you, Matthias! > >>> > >>> We currently do use kafka consumer and store event time highwatermarks > as > >>> offset metadata. This is used during backup procedure, which is to > >> create a > >>> snapshot of the target storage with all events up to certain timestamp > >> and > >>> no other. > >>> > >>> As for the API - I guess being able to provide partition-to-metadata > map > >> in > >>> the context commit method would do it (to be called from within > punctuate > >>> method). BTW as far as I understand if Processor API is used flushing > >>> producers and committing offsets is correlated and both output topic > >> state > >>> and committed offsets do correspond to a state at the moment of some > >>> punctuation. Meaning that if I do have a deterministic processing > >> topology > >>> my output topic is going to be deterministic as well (modulo duplicates > >> of > >>> course). Am I correct here? > >>> > >>> Best regards, > >>> Stanislav. > >>> > >>> > >>> 2018-03-05 2:31 GMT+01:00 Matthias J. Sax <matth...@confluent.io>: > >>> > >>>> You are correct. This is not possible atm. > >>>> > >>>> Note, that commits happen "under the hood" and users cannot commit > >>>> explicitly. Users can only "request" as commit -- this implies that > >>>> Kafka Streams will commit as soon as possible -- but when > >>>> `context#commit()` returns, the commit is not done yet (it only sets a > >>>> flag). > >>>> > >>>> What is your use case for this? How would you want to use this from an > >>>> API point of view? > >>>> > >>>> Feel free to open a feature request JIRA -- we don't have any plans to > >>>> add this atm -- it's the first time anybody asks for this feature. If > >>>> there is a JIRA, maybe somebody picks it up :) > >>>> > >>>> > >>>> -Matthias > >>>> > >>>> On 3/3/18 6:51 AM, Stas Chizhov wrote: > >>>>> Hi, > >>>>> > >>>>> There seems to be no way to commit custom metadata along with offsets > >>>> from > >>>>> within Kafka Streams. > >>>>> Are there any plans to expose this functionality or have I missed > >>>> something? > >>>>> > >>>>> Best regards, > >>>>> Stanislav. > >>>>> > >>>> > >>>> > >>> > >> > >> > > > >