Hi Guozhang Thanks. Assuming the checkpoint would typically be behind the offset persisted in my store (+ changelog), when the messages are replayed starting from the checkpoint, I can very well skip those by comparing against the offset in my store right? So I am not understanding why duplicates would affect my state.
Regards Sab On Fri, Apr 15, 2016 at 10:07 PM, Guozhang Wang <wangg...@gmail.com> wrote: > Hi Sab, > > For stateful processing where you have persistent state stores, you need to > maintain the checkpoint which includes the committed offsets as well as the > store flushed in sync, but right not these two operations are not done > atomically, and hence if you fail in between, you could still get > duplicates where you consume from the committed offsets while some of them > have already updated the stores. > > Guozhang > > > On Thu, Apr 14, 2016 at 11:56 PM, Sasidharan, Sabarish < > sabarish.sasidha...@harman.com> wrote: > > > Hi > > > > To achieve exactly once processing for my aggregates, wouldn’t it be > > enough if I maintain the latest offset processed for the aggregate and > > check against that offset when messages are replayed on recovery? Am I > > missing something here? > > > > Thanks > > > > Regards > > Sab > > > > > -- > -- Guozhang >