Thanks Jacob.

Regarding 2) below - is there a way to reprocess messages from an arbitrary position,
instead of from the beginning?


On 03/01/2016 06:32 PM, Jacob Maes wrote:
A couple notes that may be helpful:

1. When you have a stateful processor that dies, the changelog is the
default means by which the state is restored. Change logging is enabled
with this config:
stores.store-name.changelog

2. If, when the job comes back up, it needs to reprocess historical
messages, it sounds like you actually don't want checkpoints, but you want
to rewind to the beginning of the topic. You can achieve this with the
following configs:
systems.system-name.streams.stream-name.samza.reset.offset = true
systems.system-name.streams.stream-name.samza.offset.default = oldest
and possibly (read the doc on this one to decide if you need it):
systems.system-name.streams.stream-name.samza.bootstrap = true

http://samza.apache.org/learn/documentation/0.10/jobs/configuration-table.html
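
As a rough illustration of how the per-stream keys in note 2 are put together, here is a minimal Java sketch that expands them for a given system and stream. The RewindConfig class and its rewindToOldest method are hypothetical helpers, not part of Samza, and "kafka" / "my-topic" are placeholder names; only the key suffixes come from the notes above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Samza): expands the per-stream keys
// quoted above for a concrete system name and stream name.
final class RewindConfig {

    static Map<String, String> rewindToOldest(String system, String stream, boolean bootstrap) {
        String prefix = "systems." + system + ".streams." + stream + ".samza.";
        Map<String, String> cfg = new LinkedHashMap<>();
        // Ignore the checkpointed offset on startup ...
        cfg.put(prefix + "reset.offset", "true");
        // ... and start from the oldest available message instead.
        cfg.put(prefix + "offset.default", "oldest");
        if (bootstrap) {
            // Optional; check the configuration table before enabling.
            cfg.put(prefix + "bootstrap", "true");
        }
        return cfg;
    }

    public static void main(String[] args) {
        // "kafka" and "my-topic" are placeholder names.
        rewindToOldest("kafka", "my-topic", false)
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The printed lines can be pasted into the job's properties file. Note that these settings rewind to the oldest available message, not to an arbitrary position.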

On Tue, Mar 1, 2016 at 2:57 PM, Jagadish Venkatraman <jagadish1...@gmail.com
wrote:
Users need not worry about checkpointing. Samza will automatically commit
offsets every 60s. You can choose to commit more often by either
1. Setting task.commit.ms to a smaller value, or
2. Committing manually by setting task.commit.ms = -1 and calling
taskCoordinator.commit();
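
To make the manual-commit option concrete, here is a minimal Java sketch that requests a commit after every N processed messages. So that it runs without Samza on the classpath, the Coordinator interface below is a hypothetical stand-in for the coordinator Samza passes to process(), and BatchCommitter is an illustrative name, not a Samza class.

```java
import java.util.ArrayList;
import java.util.List;

final class ManualCommitSketch {

    // Hypothetical stand-in for the coordinator Samza hands to a task.
    interface Coordinator {
        void commit();
    }

    // Counts processed messages and requests a commit every batchSize messages.
    static final class BatchCommitter {
        private final int batchSize;
        private int sinceLastCommit = 0;

        BatchCommitter(int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.batchSize = batchSize;
        }

        // Call once per message, after the message has been fully processed.
        void onMessageProcessed(Coordinator coordinator) {
            sinceLastCommit++;
            if (sinceLastCommit >= batchSize) {
                coordinator.commit();
                sinceLastCommit = 0;
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> commits = new ArrayList<>();
        BatchCommitter committer = new BatchCommitter(3);
        for (int offset = 1; offset <= 7; offset++) {
            final int current = offset;
            // The stand-in coordinator just records when a commit was requested.
            committer.onMessageProcessed(() -> commits.add(current));
        }
        System.out.println(commits); // prints [3, 6]
    }
}
```

In a real job the same counter would live inside the task's process() method. As far as I know, in Samza 0.10 the actual call takes a scope argument, e.g. coordinator.commit(TaskCoordinator.RequestScope.CURRENT_TASK); check the javadoc for your version.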

I'm curious as to why processing from the exact previous offset is
unacceptable in your use case?

Let's say you process till offset 100, and crash. Would you not want to
resume from 100?

On Tue, Mar 1, 2016 at 1:41 PM, Jeff Ramin <jeff.ra...@singlewire.com>
wrote:


On 03/01/2016 03:10 PM, Jagadish Venkatraman wrote:

You don't have to implement any state checkpoint. Samza automatically
checkpoints state for you. When you recover from a failure/restart you will
resume processing from the previous checkpoint.

So, it's merely a configuration issue?

What's your use case?

Pretty standard: have a consumer processing messages, which dies. When it
comes back up, it needs to process messages not just from when it died, but
from perhaps 24 hours prior to that time.


--
Jeff Ramin
Software Engineer
Singlewire Software
2601 W Beltline Hwy #510
Madison, WI 53713

Phone Direct - 608.661.1172
www.singlewire.com



--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University


