Let me explain the use case in more detail:
We are keeping the data in 2 systems in sync. Let's call the upstream
system the "source" and the downstream system the "destination". The
destination system is backed up (locally) once per day (let's say at
3:00 AM).
It's now 1:00 PM, and we've been processing messages to keep the systems
in sync since 3:00 AM.
At this time, the destination system dies. A replacement is created, and
the backup taken at 3:00 AM is used as the starting state for the new
system.
We'd then want to replay all messages that occurred since 3:00 AM.
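To make the requirement concrete, here is a minimal sketch of the replay
I have in mind. It is not Samza code: the in-memory log, the
replay_since function, and the sample timestamps are all hypothetical,
and it assumes every message carries a timestamp and that the log is
sorted by it.

```python
from bisect import bisect_left
from datetime import datetime


def replay_since(log, backup_time, apply):
    """Apply every message whose timestamp is at or after backup_time.

    `log` is a list of (timestamp, message) pairs sorted by timestamp.
    Returns the number of messages replayed.
    """
    timestamps = [ts for ts, _ in log]
    # Binary search for the first message at or after the backup time.
    start = bisect_left(timestamps, backup_time)
    for _, message in log[start:]:
        apply(message)
    return len(log) - start


# Hypothetical messages: one before the 3:00 AM backup, two after it.
log = [
    (datetime(2016, 3, 1, 2, 30), "update A"),
    (datetime(2016, 3, 1, 3, 15), "update B"),
    (datetime(2016, 3, 1, 12, 45), "update C"),
]
destination = []
replayed = replay_since(log, datetime(2016, 3, 1, 3, 0), destination.append)
```

A message stamped exactly at the backup time is replayed here, so
applying a message to the destination would need to be idempotent.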
On 03/01/2016 04:57 PM, Jagadish Venkatraman wrote:
Users need not worry about checkpointing. Samza will automatically commit
offsets every 60s. You can choose to commit more often by either
1. Setting task.commit.ms to a smaller value, or
2. Committing manually by setting task.commit.ms = -1 and calling
taskCoordinator.commit();
I'm curious why processing from the exact previous offset is
unacceptable in your use case.
Let's say you process up to offset 100 and then crash. Wouldn't you want
to resume from 100?
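As a rough illustration, the two options above might look like this in
a job's properties file. The property name and the -1 value are taken
from the message above; the 10000 value is only an arbitrary example of
"a smaller value".

```properties
# Option 1: commit offsets more often than the default 60s
# (10000 ms is an illustrative value, not a recommendation)
task.commit.ms=10000

# Option 2: turn off periodic commits and commit from the task instead
# task.commit.ms=-1
```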
On Tue, Mar 1, 2016 at 1:41 PM, Jeff Ramin <jeff.ra...@singlewire.com>
wrote:
On 03/01/2016 03:10 PM, Jagadish Venkatraman wrote:
You don't have to implement any state checkpoint. Samza automatically
checkpoints state for you. When you recover from a failure/restart you
will resume processing from the previous checkpoint.
So, it's merely a configuration issue?
What's your use case?
Pretty standard: we have a consumer processing messages, and it dies.
When it comes back up, it needs to process messages not just from when
it died, but from perhaps 24 hours prior to that time.
--
Jeff Ramin
Software Engineer
Singlewire Software
2601 W Beltline Hwy #510
Madison, WI 53713
Phone Direct - 608.661.1172
www.singlewire.com