Hi Yi, I got it now, thanks for your help!
———————— Qi Shu > 在 2017年5月18日,05:22,Yi Pan <nickpa...@gmail.com> 写道: > > Hi, Qi, > > This would depend on the following two factors: > # whether the send() is async or sync > # how do you handle the send failure > > If the send() is sync, you will always receive an exception in your > process() method when MessageCollector.send() is invoked. Hence, if your > code does not handle the exception, it would be thrown out to the RunLoop > and the whole container will fail. If your code captures the exception, it > is then up to your application logic to deal with the send failure (i.e. > user will need to choose either ignore the send failure and proceed, or > fail and stop). If you choose to not ignore the send failures, then in this > case, the checkpoint will not proceed beyond the input that caused the send > failures, and the container will restart with the previous checkpoint, > which does not cause data loss. > > If the send() is async, the commit procedure in RunLoop will make sure to > flush all pending sends before checkpointing. If the flush fails, the > exception will be thrown out and the container will fail. Hence, when > restarted, the container will repeat from the previous checkpoint (i.e. at > least once delivery still holds and no data loss). > > Hope the above answers your question. > > Thanks! > > -Yi > > On Thu, May 11, 2017 at 12:43 AM, 舒琦 <sh...@eefung.com> wrote: > >> Hi Jagadish, >> >> I may not express my questions clearly. >> >> Here is what I want to know. When MessageCollector.send is called >> in process method, if sending fail and fail again, under this situation is >> it possible to cause data loss ( continue to fetch and process messages, >> but can’t send them out, at the same time offset is still forwarding and >> checkpointing ). >> >> Thanks very much. >> >> ———————— >> Qi Shu >> >>> 在 2017年5月11日,15:35,Jagadish Venkatraman <jagadish1...@gmail.com> 写道: >>> >>> Hi Qi, >>> >>>>> If one record can’t be sent out all the time, then the consumer >>> will still fetch messages or not, and what about the offset >> checkpointing? >>> >>> Polling / fetching messages from the consumer (in case of Kafka) happens >> in >>> a separate thread. >>> >>> Samza offers an at-least once processing guarantee with zero data loss. >>> >>> I'm not sure I understand your specific question about checkpointing? >>> >>> >>> On Thu, May 11, 2017 at 12:28 AM, 舒琦 <sh...@eefung.com> wrote: >>> >>>> Hi, >>>> >>>> Below is the description about checkpointing. >>>> >>>> 『Checkpointing is guaranteed to only cover events that are fully >>>> processed. It happens only when there are no pending >>>> process()/processAsync() or WindowableTask.window() invocations. All >>>> preceding invocations happen-before checkpointing and checkpointing >>>> happens-before all subsequent invocations.』 >>>> >>>> If one record can’t be sent out all the time, then the consumer >>>> will still fetch messages or not, and what about the offset >> checkpointing? >>>> >>>> Thanks! >>>> >>>> ———————— >>>> Qi Shu >>> >>> >>> >>> >>> -- >>> Jagadish V, >>> Graduate Student, >>> Department of Computer Science, >>> Stanford University >> >>