> On Feb. 17, 2016, 5:34 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > line 57 > > <https://reviews.apache.org/r/43350/diff/3/?file=1239498#file1239498line57> > > > > This is a little subjective. My opinion on this one is: if we are > > rolling the containerRequestState usage within the base class (as this > > runContainer() method is doing), we would better wrap it all. Not being > > able to see the state read/write pattern in a single file is painful when > > something went wrong. We can keep the runContainer() method but leave the > > state update outside.
My feeling is that if we extract the state update from the method, it would defeat the purpose. The goals were to: 1. Avoid the triplicate code 2. Tie the state update and running the container together, you should never do one without the other. - Jake ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43350/#review119488 ----------------------------------------------------------- On Feb. 10, 2016, 3:57 p.m., Jake Maes wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/43350/ > ----------------------------------------------------------- > > (Updated Feb. 10, 2016, 3:57 p.m.) > > > Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan > (Data Infrastructure). > > > Repository: samza > > > Description > ------- > > * Throw a SamzaContainerStartException whenever the container start request > to the NM fails. > * Catch the exception in the Allocator and re-request for a container on > ANY-HOST > * Release the failed container > * Also, refactored some methods in the ContainerUtil and ContainerAllocators > to simplify the code. > > > Diffs > ----- > > > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java > 9ee2daccc3c44202308637207a084def81b49c09 > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java > 7c57a866114aabf76a36d3b7ca4c5810628e0c77 > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java > ab3061eae2cfc2da6681ce2034492b165d0d8b96 > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java > 91fae98c074e1648e7168fb8e76d6e1e656816fc > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java > ff22dbfe5537ba2a3c55c4e6063680ff4c9e55f4 > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java > bc5b606394351e4039d084914fe5898e6ff148a1 > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerLaunchException.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java > a3562a1155cbf24989200063b3ebd472b295db37 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java > e2b45d7577dea5a4a71af22521c93a7fd75eaefc > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java > 269d82479650b3bc2890d250da0391d34104b1eb > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerListener.java > 8fc0b9898ed4484b9db98da7f622ab61f35cd4b1 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerRequestState.java > e7441e512e73b5d09740f252dc52efece640bea4 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java > 4426ce6ac8808257dc00df50452df3d7555d6d0b > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java > 951e0f9504b87263eba891d367c9806248241c7d > > Diff: https://reviews.apache.org/r/43350/diff/ > > > Testing > ------- > > Added 2 unit tests and I'm doing some manual testing in parallel with this > review. The manual test is to kill the NM for one of the containers of a job > that uses Host Affinity. When the RM detects the outage, it should notify the > AM which will try to restart the container on the same host. It will get a > connection error and at that point should retry on a DIFFERENT host. > > > Thanks, > > Jake Maes > >