Re: Question on changelog partition mapping

2016-09-09 Thread Tommy Becker
Done. https://issues.apache.org/jira/browse/SAMZA-1012

Re: Question on changelog partition mapping

2016-08-26 Thread Yi Pan
Hi, Tommy, It is perfectly fine. Would you please open a JIRA to include this improvement? Thanks! -Yi

Re: Question on changelog partition mapping

2016-08-26 Thread Tommy Becker
Hey Yi, Apologies for the lateness of my reply. Yeah that makes sense, and we can certainly implement. Would you consider accepting a PR that makes this change to the standard groupers? It's just strange that the generated partition mappings can vary like this, even for identical inputs. -Tom

Re: Question on changelog partition mapping

2016-08-16 Thread Yi Pan
Hi, Tommy, Yes. Now I understand what you referred to as "non-determinism". The JobCoordinator was designed with the idea that "if no previous run is found, we are free to start from scratch". I think the current solution that you can try is to implement a grouper that will guarantee t
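
The deterministic-mapping idea discussed in this message can be sketched roughly as follows. This is a language-neutral illustration in Python, not Samza code: the function name `assign_changelog_partitions`, its parameters, and the task names used are all hypothetical. It assumes the changelog partition mapping is built from a collection of task names, and that sorting those names before numbering them is enough to make the result independent of iteration order.

```python
def assign_changelog_partitions(task_names, previous_mapping=None):
    """Map each task name to a changelog partition number (hypothetical helper).

    Assignments from a previous run are kept unchanged. Tasks without a
    previous assignment receive the next free partition numbers in sorted
    task-name order, so identical inputs always yield an identical mapping.
    """
    mapping = dict(previous_mapping or {})
    # Continue numbering after the highest partition already in use.
    next_partition = max(mapping.values(), default=-1) + 1
    # Sorting removes any dependence on set or hash-map iteration order.
    for name in sorted(set(task_names)):
        if name not in mapping:
            mapping[name] = next_partition
            next_partition += 1
    return mapping


if __name__ == "__main__":
    # The same tasks in a different order produce the same mapping.
    first = assign_changelog_partitions(["Partition 2", "Partition 0", "Partition 1"])
    second = assign_changelog_partitions({"Partition 1", "Partition 2", "Partition 0"})
    print(first == second)
```

Note that the sort is lexicographic, so "Partition 10" would be numbered before "Partition 2"; that ordering is still stable across runs, which is the only property this sketch aims for.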

Re: Question on changelog partition mapping

2016-08-12 Thread Tommy Becker
Hi Yi, Thanks for the response. We are running Samza 0.9.1, so we do not yet have the coordinator stream. But to answer your other questions, the number of task instances did not change. Specifically, none of the input topic, the number of partitions in that topic, nor the grouper algorithm ch

Re: Question on changelog partition mapping

2016-08-11 Thread Yi Pan
Hi, Tommy, Which version of Samza are you using? Since 0.10, the changelog partition mapping has been moved to the coordinator stream, not in the checkpoint topic any more. That said, I want to ask a few more questions to understand what you referred to as "non-deterministic" behavior. So, betwee

Question on changelog partition mapping

2016-08-11 Thread Tommy Becker
We recently had an issue that caused us to lose the contents of one of our Samza job's checkpoint topics. We were not that concerned about losing the checkpointed offsets and so we restarted the job. We then started seeing some very strange results and were able to trace it back to the fact tha