As mentioned, a commit can---from a contract point of view---happen anytime. Of course, we only commit offsets of records that are fully processed. As punctuations are independent of records, there is no guarantee when it will be called though.
Currently, we do the following sequence (but this is just an implementation and can change anytime): https://github.com/apache/kafka/blob/0.11.0.1/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L510-L524 You see that we call process(), afterwards maybePunctuate() that might call the punctuation() if enough time passed, and finally we call maybeCommit() that might commit, if commit interval passed. I guess nothing prevents you from implementing the batch write pattern you describe. You should however, not pass data in-memory from process() to punctuate(), but use a state store that is backed reliably -- otherwise, there is no guarantee that you don't loose data if an error occurs in-between. Don't think, that a plain Consumer/Producer would be better here. If something goes wrong, Streams will replay data for you similar to a consumer. Hope this helps! -Matthias On 10/23/17 6:58 PM, Tobias Adamson wrote: > Hi Matthias > Does this mean that an offset can be committed before process or punctuate is > called > Or that it could be called up to 30s after process/punctuate is called? > > We would like to do batch writes in punctuate of X amount messages gathered > in process. > If the process or punctuate step fails / stream instance restarts we would > like to reprocess the batch again > Is this a use case that it is intended for or should we use a normal consumer > instance? > > Toby > >> On 24 Oct 2017, at 2:27 AM, Matthias J. Sax <matth...@confluent.io> wrote: >> >> Committing is independent of process and/or punctuate. >> >> You can configure your Kafka Streams application commit interval to any >> value you like via `commit.interval.ms` parameter (default is 30 seconds). >> >> Thus, there is no guarantee when a commit exactly happens with regard to >> calling process and punctuate. >> >> >> -Matthias >> >> On 10/22/17 11:43 PM, Tobias Adamson wrote: >>> Hi >>> What is the contract around Processor.process and punctuate. >>> >>> When will Kafka streams commit the offset >>> After the process method is called successfully or not until punctuate is >>> called? >>> >>> Regards >>> Toby >>> >> >
signature.asc
Description: OpenPGP digital signature