Re: Batch processing with Kafka Streams with at-least-once semantics

2018-10-16 Thread Tomoyuki Saito
Hi, I've read the contributing guide below and filed a PR: https://github.com/apache/kafka/pull/5809 I started without creating a JIRA ticket. Guide: https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes Thank you, Tomoyuki On Wed, Oct 17, 2018 at 9:19 AM Tomoyuki Saito wrote: > Hi, > > >

Re: Batch processing with Kafka Streams with at-least-once semantics

2018-10-16 Thread Tomoyuki Saito
Hi, > Would you like to contribute a PR? Yes! Sounds great. Should I file a JIRA ticket first? Tomoyuki On Wed, Oct 17, 2018 at 12:19 AM Guozhang Wang wrote: > I think we should not allow negative values, and today it seems that this is not checked against. > In fact, it should be a o

Re: Batch processing with Kafka Streams with at-least-once semantics

2018-10-16 Thread Guozhang Wang
I think we should not allow negative values, and today it seems that this is not checked. In fact, it should be a one-liner fix in the `config.define` function call to constrain its possible value range. Would you like to contribute a PR? Guozhang On Fri, Oct 12, 2018 at 10:56 PM Tomo
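The kind of range check discussed in this message can be sketched in plain Java. This is not Kafka's code: the `AtLeast` class, its method name, and the error message are hypothetical, and the setting name `commit.interval.ms` is assumed to be the one under discussion. The sketch only models the idea of attaching a lower-bound validator to a config definition so that a negative value fails fast instead of being silently accepted.

```java
/**
 * Hypothetical lower-bound validator, modelling a range constraint
 * attached to a config definition. Not part of Kafka.
 */
final class AtLeast {
    private final long min;

    AtLeast(long min) {
        this.min = min;
    }

    /** Throws IllegalArgumentException if value is below the allowed minimum. */
    void ensureValid(String name, long value) {
        if (value < min) {
            throw new IllegalArgumentException(
                "Invalid value " + value + " for configuration " + name
                    + ": value must be at least " + min);
        }
    }

    public static void main(String[] args) {
        AtLeast nonNegative = new AtLeast(0);
        // 0 and Long.MAX_VALUE are both accepted; only negatives are rejected.
        nonNegative.ensureValid("commit.interval.ms", 0L);
        nonNegative.ensureValid("commit.interval.ms", Long.MAX_VALUE);
        try {
            nonNegative.ensureValid("commit.interval.ms", -1L);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that both `0` and `Long.MAX_VALUE` remain valid under such a check, which matches the two values discussed elsewhere in this thread.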

Re: Batch processing with Kafka Streams with at-least-once semantics

2018-10-12 Thread Tomoyuki Saito
Hello Guozhang, Thank you for your reply. > setting to "0" will actually mean to commit every time. Hmm, I had misunderstood the code. Now I see that this is true. > You should actually set it to Long.MAX_VALUE to indicate "not commit regularly by intervals" I see. I'd consider taking th

Re: Batch processing with Kafka Streams with at-least-once semantics

2018-10-12 Thread Guozhang Wang
Hello Tomoyuki, 1. This seems like a good use case for Streams. 2. You should actually set it to Long.MAX_VALUE to indicate "not commit regularly by intervals"; setting it to "0" actually means committing every time. Then you can use ProcessorContext.commit() to manually commit after the batch is do
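The ordering suggested in this message (bulk write first, commit request second) can be sketched in plain Java without any Kafka dependency. Everything here is an assumption for illustration: `BatchingBuffer`, `bulkSink`, and `commitRequest` are hypothetical names, with `commitRequest` standing in for a call to `ProcessorContext.commit()` and `bulkSink` for the bulk insert into external storage. It is a model of the pattern, not a Kafka Streams processor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Buffers records and requests a commit only after a successful bulk write.
 * Hypothetical plain-Java model; not a Kafka Streams class.
 */
class BatchingBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> bulkSink; // hypothetical bulk insert into external storage
    private final Runnable commitRequest;     // stands in for ProcessorContext.commit()
    private final List<T> buffer = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<T>> bulkSink, Runnable commitRequest) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.bulkSink = bulkSink;
        this.commitRequest = commitRequest;
    }

    /** Called once per incoming record; flushes when the batch is full. */
    void add(T record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Writes the batch first, then requests a commit. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // If the sink throws, the buffer is kept and no commit is requested,
        // so the records count as unprocessed (at-least-once behaviour).
        bulkSink.accept(new ArrayList<>(buffer));
        buffer.clear();
        commitRequest.run();
    }

    /** Number of records buffered but not yet written. */
    int pending() {
        return buffer.size();
    }

    public static void main(String[] args) {
        int[] commits = {0};
        BatchingBuffer<String> b = new BatchingBuffer<String>(
            2,
            batch -> System.out.println("bulk insert: " + batch),
            () -> commits[0]++);
        for (String r : new String[] {"a", "b", "c", "d", "e"}) {
            b.add(r);
        }
        b.flush(); // write the final partial batch
        System.out.println("commit requests: " + commits[0]);
    }
}
```

Because the commit request comes after the write, a failure between the two can cause records to be written again after a restart, which is the duplicate-on-retry trade-off that at-least-once semantics accepts.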

Batch processing with Kafka Streams with at-least-once semantics

2018-10-10 Thread Tomoyuki Saito
Hi, I'm exploring whether it is possible to use Kafka Streams for batch processing with at-least-once semantics. I want to insert records into external storage in bulk and commit offsets after the bulk insertion to achieve at-least-once semantics. A processing topology can