The proposal looks great to me! Changing the id type to string will make sure this can work with other types of cluster which doesn't support integer id. The interface and config provides a pluggable way to have different id generators for different use cases. One minor thing I found is that the name of the config is camel case (*processor.idGenerator.class*). Seems Samza's practice is to use all lower case configs with "." delimiter. Do you think we should stick to this convention?
One more suggestion is to have a static factory method in the ProcessorIdGenerator (Like what we have in ApplicationRunner): static ProcessIdGenerator fromConfig(Config config) { ... }. With this, It will be more convenient for the ApplicationRunner to construct the generator. What do you think? Thanks, Xinyu On Wed, Mar 15, 2017 at 10:59 PM, Navina Ramesh (Apache) <nav...@apache.org> wrote: > Hi everyone, > I created a proposal for SAMZA-1126, which addresses the semantics of > ProcessorId in Samza. For most purposes, ProcessorId is same as the logical > id that Samza assigns for each Yarn container. It is primarily used in > JobModel as a key for the corresponding ContainerModel and also, in > container-level metrics. We are expanding the applicability of processorId > to be beyond a fixed set of processors. > > Please review and comment on this SEP. > > For those who are not actively following the master branch, you may have > more questions than others. Feel free to ask them here. > > @Xinyu: Since you are working on SAMZA-1067 and other related integration > APIs, can you please add an SEP for SAMZA-1067 ? This will help others (adn > me as well) get on the same page with your design/code. Let me know if > SEP-1 will work per your design for ApplicationRunner. > > Thanks! > Navina >