----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43350/#review118346 -----------------------------------------------------------
samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java (line 103) <https://reviews.apache.org/r/43350/#comment179572> Add a line in the API - saying,the preferred Host to run the container or ANY_HOST if there are no host preferences. samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java (line 121) <https://reviews.apache.org/r/43350/#comment179571> The behavior is to keep retrying indefinitely here, until allocation succeeds on ANY_HOST, right? samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java (line 53) <https://reviews.apache.org/r/43350/#comment179566> +1 on moving this to the parent abstract class. samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java (line 228) <https://reviews.apache.org/r/43350/#comment179574> Great work on separating out various methods :) Isn't it cleaner for the API to look like: getEscapedEnvVariables(CmdBuilder builder); I'm not sure about containerID being in the API. If it is merely for logging purpose, can the caller print the containerID? Ideally, I imagine this method to be a part of the builder sub-class or some Util class unrelated to Yarn. samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java (line 194) <https://reviews.apache.org/r/43350/#comment179575> It's debatable, maybe getRunningSamzaContainerID () is a better name for this public API? samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerStartException.java (line 20) <https://reviews.apache.org/r/43350/#comment179567> nit: prefer to add some java doc comments here. "A wrapper class for all exceptions thrown during a SamzaContainer launch." samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerStartException.java (line 21) <https://reviews.apache.org/r/43350/#comment179568> nit: Can we rename the class to be SamzaContainerLaunchException. I think it may be more accurate? thoughts? - Jagadish Venkatraman On Feb. 9, 2016, 1:10 a.m., Jake Maes wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/43350/ > ----------------------------------------------------------- > > (Updated Feb. 9, 2016, 1:10 a.m.) > > > Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan > (Data Infrastructure). > > > Repository: samza > > > Description > ------- > > * Throw a SamzaContainerStartException whenever the container start request > to the NM fails. > * Catch the exception in the Allocator and re-request for a container on > ANY-HOST > * Release the failed container > * Also, refactored some methods in the ContainerUtil and ContainerAllocators > to simplify the code. > > > Diffs > ----- > > > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java > 9ee2daccc3c44202308637207a084def81b49c09 > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java > 7c57a866114aabf76a36d3b7ca4c5810628e0c77 > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java > ab3061eae2cfc2da6681ce2034492b165d0d8b96 > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java > 91fae98c074e1648e7168fb8e76d6e1e656816fc > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java > ff22dbfe5537ba2a3c55c4e6063680ff4c9e55f4 > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java > bc5b606394351e4039d084914fe5898e6ff148a1 > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerStartException.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java > a3562a1155cbf24989200063b3ebd472b295db37 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java > e2b45d7577dea5a4a71af22521c93a7fd75eaefc > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java > 269d82479650b3bc2890d250da0391d34104b1eb > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerListener.java > 8fc0b9898ed4484b9db98da7f622ab61f35cd4b1 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerRequestState.java > e7441e512e73b5d09740f252dc52efece640bea4 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java > 4426ce6ac8808257dc00df50452df3d7555d6d0b > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java > 951e0f9504b87263eba891d367c9806248241c7d > > Diff: https://reviews.apache.org/r/43350/diff/ > > > Testing > ------- > > Added 2 unit tests and I'm doing some manual testing in parallel with this > review. The manual test is to kill the NM for one of the containers of a job > that uses Host Affinity. When the RM detects the outage, it should notify the > AM which will try to restart the container on the same host. It will get a > connection error and at that point should retry on a DIFFERENT host. > > > Thanks, > > Jake Maes > >