> On Feb. 9, 2016, 2:18 a.m., Jagadish Venkatraman wrote:
> > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java, line 
> > 261
> > <https://reviews.apache.org/r/43350/diff/1/?file=1238001#file1238001line261>
> >
> >     Great work on separating out various methods :)
> >     
> >     Isn't it cleaner for the API to look like:
> >     getEscapedEnvVariables(CmdBuilder builder);
> >     
> >     I'm not sure about containerID being in the API. If it is merely for 
> > logging purpose, can the caller print the containerID?
> >     
> >     Ideally, I imagine this method to be a part of the builder sub-class or 
> > some Util class unrelated to Yarn.

I debated about this. You're right, it would be nicer to have the 
commandBuilder expose this method, but there are a number of reasons this is 
more of a headache than it's worth:
1. If we put this logic in ShellCommandBuilder, it will need to be replicated 
for any user-defined CommandBuilders too.
2. If we put it in the parent class (which is in the samza-api module) then we 
need access to Util.envVarEscape() in samza-api and I don't think that's the 
right place for it
3. I could create an AbstractCommandBuilder that is in samza-core and 
implements this method, but there's no guarantee the users will extend it, and 
ContainerUtil depends on it for correctness. 

So, unless there are any objections, I'm keeping the code in ContainerUtil, but 
refactoring it so the method signatures make more sense.


> On Feb. 9, 2016, 2:18 a.m., Jagadish Venkatraman wrote:
> > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line 
> > 194
> > <https://reviews.apache.org/r/43350/diff/1/?file=1238003#file1238003line194>
> >
> >     It's debatable, maybe getRunningSamzaContainerID () is a better name 
> > for this public API?

Haha, I went back and fourth on this too. Is it primarily a running-container 
or a samza-container? Since you've voiced an opinion, I'll sway your way. :-)


> On Feb. 9, 2016, 2:18 a.m., Jagadish Venkatraman wrote:
> > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerStartException.java,
> >  line 20
> > <https://reviews.apache.org/r/43350/diff/1/?file=1238004#file1238004line20>
> >
> >     nit: prefer to add some java doc comments here. 
> >     "A wrapper class for all exceptions thrown during a SamzaContainer 
> > launch."

You caught me. :-)
Added JavaDoc.


> On Feb. 9, 2016, 2:18 a.m., Jagadish Venkatraman wrote:
> > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerStartException.java,
> >  line 21
> > <https://reviews.apache.org/r/43350/diff/1/?file=1238004#file1238004line21>
> >
> >     nit: Can we rename the class to be SamzaContainerLaunchException. I 
> > think it may be more accurate? thoughts?

No preference here. I'll rename this and the method in ContainerUtil for 
consistency.


> On Feb. 9, 2016, 2:18 a.m., Jagadish Venkatraman wrote:
> > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java,
> >  line 103
> > <https://reviews.apache.org/r/43350/diff/1/?file=1237998#file1237998line103>
> >
> >     Add a line in the API - saying,the preferred Host to run the container 
> > or ANY_HOST if there are no host preferences.

Good catch. Done.


> On Feb. 9, 2016, 2:18 a.m., Jagadish Venkatraman wrote:
> > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java,
> >  line 121
> > <https://reviews.apache.org/r/43350/diff/1/?file=1237998#file1237998line121>
> >
> >     The behavior is to keep retrying indefinitely here, until allocation 
> > succeeds on ANY_HOST, right?

Yes. The idea is that eventually, we should find a host on which we can launch 
the container (note, for the purposes of this code, we don't know or care what 
happens immediately after launch, just that we were able to instruct the NM to 
launch it). 

If we can't reach any node on the cluster, there's something horribly wrong and 
our (AM's) container is likely to go down next. 

If the necessary resources aren't available, the RM won't allocate us another 
container.


- Jake


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43350/#review118346
-----------------------------------------------------------


On Feb. 9, 2016, 1:10 a.m., Jake Maes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43350/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2016, 1:10 a.m.)
> 
> 
> Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan 
> (Data Infrastructure).
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> * Throw a SamzaContainerStartException whenever the container start request 
> to the NM fails.
> * Catch the exception in the Allocator and re-request for a container on 
> ANY-HOST
> * Release the failed container
> * Also, refactored some methods in the ContainerUtil and ContainerAllocators 
> to simplify the code.
> 
> 
> Diffs
> -----
> 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java
>  9ee2daccc3c44202308637207a084def81b49c09 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java 
> 7c57a866114aabf76a36d3b7ca4c5810628e0c77 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java 
> ab3061eae2cfc2da6681ce2034492b165d0d8b96 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java 
> 91fae98c074e1648e7168fb8e76d6e1e656816fc 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java
>  ff22dbfe5537ba2a3c55c4e6063680ff4c9e55f4 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java 
> bc5b606394351e4039d084914fe5898e6ff148a1 
>   
> samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerStartException.java
>  PRE-CREATION 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java 
> a3562a1155cbf24989200063b3ebd472b295db37 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java
>  e2b45d7577dea5a4a71af22521c93a7fd75eaefc 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java
>  269d82479650b3bc2890d250da0391d34104b1eb 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerListener.java
>  8fc0b9898ed4484b9db98da7f622ab61f35cd4b1 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerRequestState.java
>  e7441e512e73b5d09740f252dc52efece640bea4 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java
>  4426ce6ac8808257dc00df50452df3d7555d6d0b 
>   
> samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java
>  951e0f9504b87263eba891d367c9806248241c7d 
> 
> Diff: https://reviews.apache.org/r/43350/diff/
> 
> 
> Testing
> -------
> 
> Added 2 unit tests and I'm doing some manual testing in parallel with this 
> review. The manual test is to kill the NM for one of the containers of a job 
> that uses Host Affinity. When the RM detects the outage, it should notify the 
> AM which will try to restart the container on the same host. It will get a 
> connection error and at that point should retry on a DIFFERENT host.
> 
> 
> Thanks,
> 
> Jake Maes
> 
>

Reply via email to