Hi, Tommy,

Which version of Samza are you using? Since 0.10, the changelog partition mapping has been stored in the coordinator stream rather than in the checkpoint topic.

That said, I want to ask a few more questions to understand what you referred to as "non-deterministic" behavior. Did the total number of tasks change between the job restarts? As you have observed, the total number of partitions in a changelog topic equals the total number of tasks in a job. The reasons for the total number of tasks to change include:

- the input topic partitions changed
- the grouper algorithm changed

In both cases, the state is no longer considered valid, since data may already have been shuffled between the Kafka partitions or between the tasks.

Could you clarify whether you saw the "non-determinism" with or without a change in the total number of tasks?

Thanks!

-Yi

On Thu, Aug 11, 2016 at 11:56 AM, Tommy Becker <tobec...@tivo.com> wrote:

> We recently had an issue that caused us to lose the contents of one of our
> Samza job's checkpoint topics. We were not that concerned about losing the
> checkpointed offsets and so we restarted the job. We then started seeing
> some very strange results and were able to trace it back to the fact that
> the changelog partition mapping had changed. We were unaware this data was
> stored in the checkpoint topic. Can someone explain why this mapping is
> necessary? I was under the impression that the number of changelog
> partitions is identical to the number of task instances. If this is so,
> can't partitions just be assigned based on the task number? Assuming the
> mapping is necessary, it would be nice if it were deterministic. Looking at
> JobCoordinator, it seems to be dependent on the order in which things come
> back in the map produced by the SystemStreamPartitionGrouper. This
> non-determinism seems to have been the cause of our issues. Obviously data
> loss is a problem, but it seems like Samza could have recreated the
> original mapping. Should I file a bug on this?
>
> --
> Tommy Becker
> Senior Software Engineer
>
> Digitalsmiths
> A TiVo Company
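
To illustrate the deterministic assignment asked about in the quoted message, here is a minimal Python sketch. It is not Samza's JobCoordinator code: the function name `assign_changelog_partitions`, its parameters, and the task names used below are hypothetical and chosen only for illustration. The idea is to keep any previously stored task-to-partition assignments and to give the remaining tasks the lowest unused partition ids in sorted task-name order, so the result does not depend on the iteration order of the grouper's map.

```python
from typing import Dict, Iterable, Optional


def assign_changelog_partitions(
    task_names: Iterable[str],
    previous: Optional[Dict[str, int]] = None,
) -> Dict[str, int]:
    """Map each task name to a changelog partition id, deterministically.

    Hypothetical helper for illustration only; not part of Samza.
    """
    previous = dict(previous or {})
    # Sorting removes any dependence on the order the names arrive in.
    names = sorted(set(task_names))

    # Keep the stored assignment for every task that still exists.
    mapping = {name: previous[name] for name in names if name in previous}
    used = set(mapping.values())

    # Give each remaining task the lowest partition id not yet in use.
    next_id = 0
    for name in names:
        if name in mapping:
            continue
        while next_id in used:
            next_id += 1
        mapping[name] = next_id
        used.add(next_id)
    return mapping


# The same set of tasks yields the same mapping regardless of input order.
a = assign_changelog_partitions(["Partition 2", "Partition 0", "Partition 1"])
b = assign_changelog_partitions(["Partition 1", "Partition 2", "Partition 0"])
```

Note that the sort is lexicographic, so a name such as "Partition 10" orders before "Partition 2"; the assignment is still deterministic, but it does not necessarily match the numeric task index. If the stored mapping is lost, this sketch can only recreate the original assignment when that assignment was itself produced by the same rule.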