----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43350/ -----------------------------------------------------------
Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan (Data Infrastructure). Repository: samza Description ------- * Throw a SamzaContainerStartException whenever the container start request to the NM fails. * Catch the exception in the Allocator and re-request for a container on ANY-HOST * Release the failed container * Also, refactored some methods in the ContainerUtil and ContainerAllocators to simplify the code. Diffs ----- samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java 9ee2daccc3c44202308637207a084def81b49c09 samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java 7c57a866114aabf76a36d3b7ca4c5810628e0c77 samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java ab3061eae2cfc2da6681ce2034492b165d0d8b96 samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java 91fae98c074e1648e7168fb8e76d6e1e656816fc samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java ff22dbfe5537ba2a3c55c4e6063680ff4c9e55f4 samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java bc5b606394351e4039d084914fe5898e6ff148a1 samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerStartException.java PRE-CREATION samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java a3562a1155cbf24989200063b3ebd472b295db37 samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java e2b45d7577dea5a4a71af22521c93a7fd75eaefc samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java 269d82479650b3bc2890d250da0391d34104b1eb samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerListener.java 8fc0b9898ed4484b9db98da7f622ab61f35cd4b1 samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerRequestState.java e7441e512e73b5d09740f252dc52efece640bea4 samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java 4426ce6ac8808257dc00df50452df3d7555d6d0b samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java 951e0f9504b87263eba891d367c9806248241c7d Diff: https://reviews.apache.org/r/43350/diff/ Testing ------- Added 2 unit tests and I'm doing some manual testing in parallel with this review. The manual test is to kill the NM for one of the containers of a job that uses Host Affinity. When the RM detects the outage, it should notify the AM which will try to restart the container on the same host. It will get a connection error and at that point should retry on a DIFFERENT host. Thanks, Jake Maes