As mentioned, a commit can---from a contract point of view---happen
anytime. Of course, we only commit offsets of records that are fully
processed. As punctuations are independent of records, there is no
guarantee when it will be called though.

Currently, we do the following sequence (but this is just an
implementation and can change anytime):
https://github.com/apache/kafka/blob/0.11.0.1/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L510-L524

You see that we call process(), afterwards maybePunctuate() that might
call the punctuation() if enough time passed, and finally we call
maybeCommit() that might commit, if commit interval passed.


I guess nothing prevents you from implementing the batch write pattern
you describe. You should however, not pass data in-memory from process()
to punctuate(), but use a state store that is backed reliably --
otherwise, there is no guarantee that you don't loose data if an error
occurs in-between.

Don't think, that a plain Consumer/Producer would be better here. If
something goes wrong, Streams will replay data for you similar to a
consumer.


Hope this helps!


-Matthias

On 10/23/17 6:58 PM, Tobias Adamson wrote:
> Hi Matthias
> Does this mean that an offset can be committed before process or punctuate is 
> called
> Or that it could be called up to 30s after process/punctuate is called?
> 
> We would like to do batch writes in punctuate of X amount messages gathered 
> in process.
> If the process or punctuate step fails / stream instance restarts we would 
> like to reprocess the batch again
> Is this a use case that it is intended for or should we use a normal consumer 
> instance?
> 
>  Toby
> 
>> On 24 Oct 2017, at 2:27 AM, Matthias J. Sax <matth...@confluent.io> wrote:
>>
>> Committing is independent of process and/or punctuate.
>>
>> You can configure your Kafka Streams application commit interval to any
>> value you like via `commit.interval.ms` parameter (default is 30 seconds).
>>
>> Thus, there is no guarantee when a commit exactly happens with regard to
>> calling process and punctuate.
>>
>>
>> -Matthias
>>
>> On 10/22/17 11:43 PM, Tobias Adamson wrote:
>>> Hi
>>> What is the contract around Processor.process and punctuate.
>>>
>>> When will Kafka streams commit the offset
>>> After the process method is called successfully or not until punctuate is 
>>> called?
>>>
>>> Regards
>>> Toby
>>>
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to