> On Feb. 23, 2016, 10:26 p.m., Xinyu Liu wrote:
> > samza-core/src/main/scala/org/apache/samza/config/TaskConfig.scala, line 97
> > <https://reviews.apache.org/r/43350/diff/5/?file=1255125#file1255125line97>
> >
> >     Usually this is done in Scala using option.getOrElse. So instead of 
> > adding a new method, we can just do
> >     
> >     config.getCommandClass.getOrElse(defaultValue).

This is a good suggestion, but in this case, the caller is written in Java and 
there isn't a concise way to pass a function to the getOrElse() method from 
Java.


- Jake


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43350/#review120393
-----------------------------------------------------------


On Feb. 23, 2016, 9:43 p.m., Jake Maes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43350/
> -----------------------------------------------------------
> 
> (Updated Feb. 23, 2016, 9:43 p.m.)
> 
> 
> Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan 
> (Data Infrastructure).
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> * Throw a SamzaContainerStartException whenever the container start request 
> to the NM fails.
> * Catch the exception in the Allocator and re-request for a container on 
> ANY-HOST
> * Release the failed container
> * Also, refactored some methods in the ContainerUtil and ContainerAllocators 
> to simplify the code.
> * Finally, fix a logic bug in the HostAwareContainerAllocator wherein the 
> expiration logic was flipped.
> 
> 
> Diffs
> -----
> 
>   samza-core/src/main/scala/org/apache/samza/config/TaskConfig.scala 
> 51e9e99e506c065938b18d081df6fe1ff24f79ec 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java
>  2e192eecf412525f45846edb62b23d261d35cbda 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java 
> 31fcc572d499d4e17cdf1340d7072c1562ccfdb7 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java 
> 54db5e5b2b1b109d202e814809adbfd2bc84fb4b 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java 
> 91fae98c074e1648e7168fb8e76d6e1e656816fc 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java
>  8e1db77db74e1dc01bb0ec64e102befd225123e9 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java 
> bc5b606394351e4039d084914fe5898e6ff148a1 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerLaunchException.java
>  PRE-CREATION 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java 
> a3562a1155cbf24989200063b3ebd472b295db37 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java
>  e2b45d7577dea5a4a71af22521c93a7fd75eaefc 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java
>  269d82479650b3bc2890d250da0391d34104b1eb 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaTaskManager.java 
> 88d9f24d16fc3d9842b387cfc22edaf1dfa6fd06 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerAllocator.java
>  5fcad82fc10f87510e994896ed7380f1e636a18f 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerListener.java
>  8fc0b9898ed4484b9db98da7f622ab61f35cd4b1 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerRequestState.java
>  e7441e512e73b5d09740f252dc52efece640bea4 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java
>  4426ce6ac8808257dc00df50452df3d7555d6d0b 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java
>  951e0f9504b87263eba891d367c9806248241c7d 
> 
> Diff: https://reviews.apache.org/r/43350/diff/
> 
> 
> Testing
> -------
> 
> Added 2 unit tests and I'm doing some manual testing in parallel with this 
> review. The manual test is to kill the NM for one of the containers of a job 
> that uses Host Affinity. When the RM detects the outage, it should notify the 
> AM which will try to restart the container on the same host. It will get a 
> connection error and at that point should retry on a DIFFERENT host.
> 
> The testing was successful for both scenarios. 
> 1. RM is unaware that some NMs are down when job is starting:
>    a. RM attempts to start AM on hosts, retrying up to the limit (50) in the 
> case of connection errors to the NMs
>    b. AM requests containers (accounting for host affinity) and tries to 
> start the containers
>    c. YarnClient retries connection errors (an excessive number of times) and 
> then throws to the AM
>    d. AM releases the bad container and requests a new one on ANY_HOST
>    e. Container is eventually started on a good host. 
> 2. RM is unaware that some NMs are down when job is already running:
>    a. NM-RM heartbeat interval expires
>    b. RM is aware of the missing NMs and containers
>    c. RM notifies AM of the completed containers
>    d. AM requests new containers which the RM allocates only on good hosts 
> (since it is already aware of the bad hosts)
>    e. Container is started on a good host.
>    
> In both cases, the job stays alive and the full set of containers is 
> eventually running on good hosts. Also, the state counters, like 
> "neededContainers" are correct throughout the test.
> 
> 
> Thanks,
> 
> Jake Maes
> 
>

Reply via email to