Hi Xinyu, Apologies, looks like I sent this from my Li email and was not updated, let me send it again from my personal email.
Thanks for the feedback, here are my comments: The title of the SEP is Container Placements in Samza, which sounds like > it's going to reimplement the container placements logic currently in > ClusterBasedJobCoordinator. After reading the proposal, I think it should > be renamed to be a much smaller scope, e.g. Container Relocation Tool. Container Placements title suggests building the ability to move/restart containers of a job on a same or different host, the external tool moving containers is one user of it, other users can be a Load balancer controller (Cruise control like system), ASC controller resizing containers with different sizes Some other abilities are: 1. It will also give as an ability to override already existing locality when one or partial containers are down with job running in a degraded state 2. Spin up & place a standby for one or few containers Hence I feel the naming is justified - Goal: the goal is to build a tool, right? I want to be crystal clear on > that. After the solution, I think there will be an tool to do container > relocation for Samza open source. Build the ability doesn't mean much to > Samza open source users I believe. Sure the Goals section mention "Expose these APIs via a tool or dashboard for operability", I can add more emphasis on it. At the same time, open-source users can use these API in their custom tools for addressing Operability - Proposed Solution: after reading this I think the rejected the option is > to write to metadastore, and the accepted one is to develop a container > placement service. But then I read the architecture part, the proposal > says > the preferred approach is to use Samza metastore API to read and write the > container placement metadata. It feels contradictory to me. Seems the > metadatsstore is a must-have piece in this solution, so it might be cleaner > to remove the first solution. Nice catch, the first approach is actually directly modifying the locality mapping in the metadata store (coordinator stream) as compared to the other approach where we use metadata store to read/write Container placement request/response under a *separate metastore namespace. *Let me add these details so others don't get confused - I don't understand the "service" part of the container placement. Is it a > separate service that can be hosted somewhere? Is it based on jetty or > netty? what's the communication protocol? From reading the proposal, it > looks like a set of API instead of service. If that's the case, please > remove all the usage of "container placement service" in this proposal. Its referred to as a service since there is a listener of events running in the JobCoordinator (ContainerPlacementHandler) which relays the container placement request messages to the JobCoordinator, metastore is just used as a communication channel which we can logically compare with a REST Service which where we have an event handler listening on a port for request. The communication protocol for request/response messages is defined by us using metastore (Kafka) as a communication channel. It's not an independent service that can be hosted somewhere but more an embedded one. If can think of a better alternative for this & follow up. - key-value format: the description of the uuid of the > ContainerPlacementRequestMessage has a typo I think (response -> request). Corrected, thanks! - The diagram of the components interaction look nice, and definitely helps > me understand what the solution will look like. I didn't find a description > about writing the new locality information. I am interested in the ordering > of that regarding to the ordering of reserving resource and running the > container. I feel that part will be a bit complicated. Nice catch, I added this section to the doc: *Who writes the new locality mapping after a successful move?* Samza container on a successful start write their new locality message to the metadata store (code <https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala#L749>), hence after a successful move container writes its new locality - public interfaces: are these interfaces intended to by used by samza > users? How do they use them? Is there an example? Are these interfaces > going into samza-api? Are those interfaces stable? Why the uuid in the > interface cannot be generated by Samza itself? Why Let me add an example to use them and add two-section Public Interfaces containing (ContainerPlacementMessage, ContainerPlacementRequestMessage, and ContainerPlacementResponseMessage) with the interface stability evolving. UUID is generated by samza itself, please check ContainerPlacementMetadataStore#writeContainerPlacementRequestMessage. In terms of usage please see the answer to the next question I added a sample on SEP showing use of these interfaces: > ContainerPlacementMetadatstore is part of the API? should that be hidden > from the users? If the goal is to provide a tool, I would rather make these > classes internal in samza-core and also describe how the tool will look > like here. ContainerPlacementMetadatstore is already part of samza core and I added another section giving a sample usage: *Example Usage:* place-container --deployment-id 1581635852024-5117e303 --app-name snjain- test-cp --app.id = 1 --processor-id 4 --request-expiry 10 --destination-host abc.prod.com *Example Tool:* @CommandLine.Command(name = "place-container", description = "Request to move/restart container at destination-host") public class ContainerPlacementTool { ... _appName = // read from commandline _appId = // read from commandline _deploymentId = // read from commandline _processorId = // read from commandline _destinationHost = // read from commandline _requestExpiry = // read from commandline MetadataStore metadataStore = buildMetadataStore(_appName, _appId); try { ContainerPlacementMetadataStore containerPlacementMetadataStore = new ContainerPlacementMetadataStore(metadataStore); containerPlacementMetadataStore.start(); Duration requestExpiry = _requestExpiry != null ? Duration.ofSeconds(_requestExpiry) : null; UUID uuid = containerPlacementMetadataStore.writeContainerPlacementRequestMessage(_deploymentId, _processorId, _destinationHost, _requestExpiry, System.currentTimeMillis()); System.out.println("Request received query the status using: " + uuid); } finally { metadataStore.close(); } } Thanks Sanil On Thu, 13 Feb 2020 at 11:19, Xinyu Liu <xinyuliu...@gmail.com> wrote: > Thanks for writing a very detailed design doc. My feedback is below: > > - The title of the SEP is Container Placements in Samza, which sounds like > it's going to reimplement the container placements logic currently in > ClusterBasedJobCoordinator. After reading the proposal, I think it should > be renamed to be a much smaller scope, e.g. Container Relocation Tool. > - Goal: the goal is to build a tool, right? I want to be crystal clear on > that. After the solution, I think there will be an tool to do container > relocation for Samza open source. Build the ability doesn't mean much to > Samza open source users I believe. > - Proposed Solution: after reading this I think the rejected the option is > to write to metadastore, and the accepted one is to develop a container > placement service. But then I read the architecture part, the proposal says > the preferred approach is to use Samza metastore API to read and write the > container placement metadata. It feels contradictory to me. Seems the > metadatsstore is a must-have piece in this solution, so it might be cleaner > to remove the first solution. > - I don't understand the "service" part of the container placement. Is it a > separate service that can be hosted somewhere? Is it based on jetty or > netty? what's the communication protocol? From reading the proposal, it > looks like a set of API instead of service. If that's the case, please > remove all the usage of "container placement service" in this proposal. > - key-value format: the description of the uuid of the > ContainerPlacementRequestMessage has a typo I think (response -> request). > - The diagram of the components interaction look nice, and definitely helps > me understand what the solution will look like. I didn't find a description > about writing the new locality information. I am interested in the ordering > of that regarding to the ordering of reserving resource and running the > container. I feel that part will be a bit complicated. > - public interfaces: are these interfaces intended to by used by samza > users? How do they use them? Is there an example? Are these interfaces > going into samza-api? Are those interfaces stable? Why the uuid in the > interface cannot be generated by Samza itself? Why > ContainerPlacementMetadatstore is part of the API? should that be hidden > from the users? If the goal is to provide a tool, I would rather make these > classes internal in samza-core and also describe how the tool will look > like here. > > Thanks, > Xinyu > > > > On Wed, Jan 15, 2020 at 10:34 AM Sanil Jain <sanil.jai...@gmail.com> > wrote: > > > Hi Prateek, > > > > Thanks for the feedback, I have updated the SEP with your suggestions. > > Please have a look. > > > > Thank you > > Sanil > > > > On Mon, 6 Jan 2020 at 10:54, Sanil Jain <sanil.jai...@gmail.com> wrote: > > > > > Hi all, > > > > > > Refreshing this back to see if there is any further feedback on this > SEP. > > > > > > Thanks > > > Sanil > > > > > > On Thu, 5 Dec 2019 at 16:25, Sanil Jain <sanil.jai...@gmail.com> > wrote: > > > > > >> Hi all, > > >> > > >> I have created SEP-22: Container Placements in Samza, which adds > > >> abilities to move containers for a Samza job seamlessly from one host > to > > >> another without restarting jobs. > > >> > > >> Please find the SEP wiki below: > > >> > > > https://cwiki.apache.org/confluence/display/SAMZA/SEP-22%3A+Container+Placements+in+Samza > > >> > > >> Please take a look and chime in your feedback. > > >> > > >> Thanks > > >> Sanil > > >> > > > > > >