> On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-core/src/main/java/org/apache/samza/job/model/JobModel.java, lines > > 65-72 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065672#file1065672line65> > > > > duplicated code with line 116 - 122
Ok. Fixed it. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-core/src/main/scala/org/apache/samza/util/Util.scala, line 44 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065674#file1065674line44> > > > > not needed, 1) it is very short alreay 2) sometimes you are using > > clock, sometimes, calling System.currentTimeMillis, why not just calling > > System.currentTimeMillis anywhere? Sure. Makes sense. We can call System.currentTimeMillis everywhere. Although I think this statement in our coding guideline - "* Do not use sleep or other timing assumptions in tests, it is always, always, always wrong and will fail intermittently on any test server with other things going on that causes delays. Write tests in such a way that they are not timing dependent. Seriously. One thing that will help this is to never directly use the system clock in code (i.e. System.currentTimeMillis) but instead to use getTime: () => Long, so that time can be mocked.*" is pretty useful to have. I don't have a strong opinion because I don't think we are using it any of our unit tests (yet) and we might have to get rid of the scala way of doing things until we move to jdk 1.8. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java, > > line 35 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065676#file1065676line35> > > > > if we are forced to implement the Runnable anyway, why not implement > > the interface in this abstract class? Ah yeah. I should have just added it to the abstract class! > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java, > > lines 67-68 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065676#file1065676line67> > > > > To be more precise, when run host-affinity-enabled job first time, the > > locality value is null too. Ok. Changed it. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java, > > line 89 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065676#file1065676line89> > > > > -> */ Changed it. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > line 37 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065677#file1065677line37> > > > > just follow the same code style: start the variable with a letter ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > line 69 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065677#file1065677line69> > > > > maybe better to put the sleep after > > containerRequestState.releaseExtraContainers(), because there is no reason > > that we sleep at the very beginning. In the beginning, it is more likely that no container requests are submitted to the RM. It really doesn't matter where the sleep is going to be. I will move it to the end if it is easier to understand the logic. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > line 71 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065677#file1065677line71> > > > > getContainersOnAHost returns allocatedContainers, not > > availableContainers. So what is the difference beween allocatedContainer > > and available conainers? It should be allocatedContainers. There is no difference. This is a result of my refactoring. Thanks for raising this point. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > line 80 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065677#file1065677line80> > > > > this getId is for the global container Id, right? What do you mean global container Id? This will be the containerId assigned by Yarn > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > line 84 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065677#file1065677line84> > > > > do not understand this comment. Why are there any extra containers? Do > > we request more containers than needed? Or this is only for host-enabled? Yeah. This is precisely the reason why this patch took so long :( Basically, container allocation in yarn is fully asynchronous, meaning it invokes "allocate" to the RM with every heartbeat. Each allocate request in the heartbeat has an up-to-date state of the container requests we have made. Since an allocated container is handled asynchronously in the AM Callback thread, it is possible that while allocated container is getting started, it will heartbeat to the RM again. RM thinks that the container request is still valid and assigns a new one. In order to avoid excessive container allocation for a single request, looks like the client (AM in this case) has to do the bookkeeping of the request state and remove any container request that has been satisfied. Unless the client explicitly remove the request from the local queue (I am referring to the queue maintained by the AMRMClient), the RM continues to assign containers. This is the case even today, because we do not remove requests from the local queue. Folks in Slider and Spark have also come across this behavior and changed their AM logic. It's time for Samza to catch-up :) Here are some relevant JIRAs for you to look at : https://issues.apache.org/jira/browse/SLIDER-828 https://issues.apache.org/jira/browse/SPARK-2687 https://issues.apache.org/jira/browse/YARN-1902 Let me know if you have more questions. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 41 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line41> > > > > remove _ removed > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java, > > lines 88-89 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065677#file1065677line88> > > > > change to log.info(msg, e)? Ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 42 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line42> > > > > I think ANY_HOST = "ANY_HOST" is less confusing. Ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > lines 44-47 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line44> > > > > what is this comment for? It is redundant. Let me remove it. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 55 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line55> > > > > naming to requestQueue maybe more descriptive. Leave it to you. Changed it. I was afraid to change it previously because it was used in a lot of places. Now there are sufficient tests. So, I can change it without fear of breaking anything :) > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 61 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line61> > > > > Maybe requests2StateMap/requests2CountMap ? just 2 cents. Definitely sounds better. I renamed it to requestsToCountMap. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > lines 96-99 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line96> > > > > 1. format the comment a little. :) using > > /* > > * > > */ > > > > 2. this comment does not explain the code following it. Maybe this > > comment should be part of the allocatedContainers variable javadoc. Fixed the formatting. The comment was intented to explain why we update the allocatedContainers when only a containerRequest is made. So, I think it is better to leave it here. I have re-worded it to explain better. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 113 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line113> > > > > just curious if it is possible that, the hostname of the machine is > > different from the before-":"-part of the HttpAddress? what do you mean by different? the format of HttpAddress is "$hostname:$port". We are only interested in the hostname here. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 116 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line116> > > > > can line 116 to 163 be refactored a little? Now it looks very confusing > > to me. (Too many nested if) > > > > Is something like: > > > > if (requestCountOnThisHost > 0 && (allocatedContainersOnThisHost == > > null || allocatedContainersOnThisHost.size() < requestCountOnThisHost)) > > > > addToAllocatedContainerList(hostName, container); > > > > } > > else { > > addToAllocatedContainerList(ANY_HOST, container); > > } > > > > sufficient for the logic? Yeah. It should be sufficient. This is how I had it before. It became hard to debug because you didn't why a container was allocated to a buffer. That's why I simplified the logic. I think for now it adds value to have detailed logging until we can make sure that the feature is stable. Do you think it is ok? > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > lines 135-137 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line135> > > > > nit: follow the block comments format when the comments can not be fit > > in one line. :) > > > > /* > > * > > */ Ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 199 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line199> > > > > you mentioned "buffer" in a few places, which variable do you refer to? buffer is how I would like semantically refer to it. As for variable name, it is still of a list type :) "buffer" refers to the list of allocated containers maintained in the allocatedContainers map. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > lines 231-232 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line231> > > > > why is this logic not in the hostAffinityEnabled clause? Hm.. This logic should apply for hostAffinityEnabled clause as well. I think this was a result of the refactoring. And looks like all it does is logging. I will actually remove this as it doesn't serve much purpose. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > lines 267-273 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line267> > > > > why not just use allocatedContainers.get(host); :) yes. You are right. I think this stems from before the time I added the fix in Line 96-99 for correlating data and null checks in the run loop. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java, > > line 280 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065679#file1065679line280> > > > > this seems not safe to me. why? how do you suggest I can change it? > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java, line > > 51 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065680#file1065680line51> > > > > no _ Done! > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java, > > lines 79-80 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065680#file1065680line79> > > > > it is worth differenciating the containerId and container.getId Ok. I will rename containerId to samzaContainerId.. Does that sound ok? > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java, line > > 178 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065680#file1065680line178> > > > > wrap it in the log or the exception? Because if users are using the > > StreamAppender, they may lose those information. Got it. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java, line > > 196 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065680#file1065680line196> > > > > same: wrap it Done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java, line > > 199 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065680#file1065680line199> > > > > same: wrap it Done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 41 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line41> > > > > no _ Done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > lines 60-68 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line60> > > > > actually both HostAwareContainerAllocator and ContainerAllocator share > > the same logic: if value is null, use ANY_HOST. We may consider to abstract > > them to AbstractContainerAllocator. my 0.02. I think I separated them to make sure that if a user switches to host-affinity and then, decides to disable it, host-affinity mapping will still be non-null. I don't think it affects the behavior or performance. Let me see if I can merge it into the abstract class. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 72 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line72> > > > > remove empty line done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 77 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line77> > > > > you mean, ANY or ANY_HOST? changed it > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 84 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line84> > > > > the same, move to the end of the try clause? Done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 86 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line86> > > > > is it possible to refactor this part a little to make it more readable? > > Currently too many nested if statement. Many conditions can be combined. As before, nested ifs enable us to log better. In this case, the action to be take on each conditional clause is not overlapping either. Let me give refactoring a try. I think I have sufficient unit tests to verify all changes. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 96 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line96> > > > > same confusion. getContainersOnAHost gives the allocated containers, > > not avaialble containers, right? > > > > I guess getContainersOnAHost should talk to the RM to negotiate the > > possible resource? Sorry about the confusion. getContainersOnAHost is just an accessor to the state we maintain in ContainerRequestState. We try to buffer all containers allocated by the RM in a map keyed by the hostname. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 142 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line142> > > > > do we define how many times we want to try before fail the whole job? > > Because if there is not sufficient resource, it is meaningless to keep > > trying the request. We don't keep track of how many times to we request. If we request for host, then we wait for CONTAINER_REQUEST_TIMEOUT ms to satisfy the request. If the requested host is not allocated within that timeout, we choose from ANY_HOST. If there is insufficient resources in the cluster, then the behavior is same as today. Job does not fully start. I think we should handle this issue. However it warrants a separate JIRA. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 158 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line158> > > > > warp it done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 161 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065681#file1065681line161> > > > > wrap it done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 42 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line42> > > > > -> */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 50 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line50> > > > > -> */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 56 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line56> > > > > -> */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 57 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line57> > > > > if this is for the AM container id, maybe amContainerId makes more > > sense? Yep > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 115 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line115> > > > > to */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 120 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line120> > > > > to */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 130 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line130> > > > > to */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java, line > > 140 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065682#file1065682line140> > > > > since now we have two different containerId, one is globally unique, > > ContainerStatus.getContainerId, and the other one is Samza's containerId > > (based on the container count). Maybe worth differenciating the name. It > > confused me a little. You are right. The existing nomenclature is the same and confusing. I have tried to change the variable names to make it more obvious. I hope it is better after refactoring. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerRequest.java, > > lines 39-42 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065683#file1065683line39> > > > > format the comment to > > /* > > * > > */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerRequest.java, > > lines 55-64 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065683#file1065683line55> > > > > combine them into one if statement? yes > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerRequest.java, > > lines 69-71 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065683#file1065683line69> > > > > I think the defaul Core, Memo, Priority should not be here because they > > already are in the YarnConfig. So we may want to pass those values from > > AbstractContainerAllocator, instead of having them here. This was just a handy construtor. Are you suggesting that I move the default definition to AbstractContainerAllocator and refer it in SamzaContainerRequest ? Let me know if I understood your comment correctly. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 51 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line51> > > > > where is the callback handler? Is it the SamzaTaskManager ? SamzaAppMaster is the callback handler because it extends the AMRMClientAsync.CallbackHandler interface. Even though the AppMaster clas is registered as callback handler implementor in Line 84 of SamzaAppMaster.scala, for the sake of clarity, the callbacks are delegated to methods in the SamzaTaskManager. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 58 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line58> > > > > -> */ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 61 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line61> > > > > remove _ done > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 85 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line85> > > > > we usually have the config in a separate class. Maybe put it into the > > YarnConfig? Makes sense. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 118 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line118> > > > > TODO ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 139 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line139> > > > > wrap it. ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 152 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line152> > > > > -> */ ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 154 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line154> > > > > javadoc will help understand this method ok > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > lines 157-161 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line157> > > > > can this be simplified as > > if > > (state.runningContainers.containsValue(containerStatus.getContainerId()), > > then ... Yeah. but we are interested in the key for that entry. If you use an if-condition to check whether the value is present, how will you return the key? You won't have any handle/reference to the map entry. Also, we are not directly comparing the value. We are checking the value of a field in the entry's value. Hence, if statement is insufficient. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 174 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line174> > > > > why is the containerId in the containerFailures list by default? It is > > possible that this containerId not in the list, right? I think the variable name is misleading. containerFailures is a map of samzaContainerId -> ContainerFailure instance. ContainerFailure instance encapsulates the last failed time and number of time a samzaContainerId failed. It is possible that the containerId is not in the list. In that case, remove acts like a no-op. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > lines 183-185 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line183> > > > > from this logic, even the Container fails because of DISKS_FAILED or > > ABORTED, we will still have the following log.info and do the line 186-205. > > Is it supposed to be this? Yes. The behavior is supposed to be the same. I guess the log.info can use a little refactoring. It doesn't accommodate the yarn 2.6 ContainerExitStatus that we have added here. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 186 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line186> > > > > use the PREEMPTED instead of -100? Made the log statement generic. Not sure how to get the name of the ExitStatus printed without maintaining a mapping. The ContainerExitStatus is not an enum, but a class with static variables. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 244 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line244> > > > > why not just give the currentFailCount and lastFailureTime default > > values? Then we can get rid of the "else" part. looks simpler. Yeah. But that means we have to initialize the containerFailures map with each containerId and a default ContainerFailure instance. I think having an else statement is simpler. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java, > > line 245 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065684#file1065684line245> > > > > should be 1. because it is used in line 270 right. > On Sept. 11, 2015, 1:47 a.m., Yan Fang wrote: > > samza-yarn/src/main/scala/org/apache/samza/job/yarn/YarnJob.scala, line 74 > > <https://reviews.apache.org/r/37817/diff/5/?file=1065693#file1065693line74> > > > > it should be amJavaHome.isEmpty() , because we set "" as the default > > value in YarnConfig. We are setting default as "null" in YarnConfig in Line 149, aren't we ? - Navina ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37817/#review98316 ----------------------------------------------------------- On Sept. 9, 2015, 12:28 a.m., Navina Ramesh wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/37817/ > ----------------------------------------------------------- > > (Updated Sept. 9, 2015, 12:28 a.m.) > > > Review request for samza, Yan Fang, Chinmay Soman, Chris Riccomini, and Yi > Pan (Data Infrastructure). > > > Bugs: SAMZA-619 > https://issues.apache.org/jira/browse/SAMZA-619 > > > Repository: samza > > > Description > ------- > > This a major change to how we request and assign containers to resources. It > uses a threaded model for request and allocation. More comments in the > javadoc. > > Major changes to note: > 1. Changed yarn dependency to 2.6 > 2. Moved YarnConfig to java config class > 2. Removed SamzaAppMasterTaskManager > 3. SamzaAppState replaces SamzaAppMasterState > 4. SamzaTaskManager is very similar to SamzaAppMasterTaskManager, except that > it delegates onContainerAllocated and requestContainers to the thread in > ContainerAllocator > 5. Removed state.unclaimedContainers > 6. Allocator thread sleep time and request timeout is configurable > 7. host-affinity can be enabled by using the config > "yarn.samza.host-affinity.enabled". Host affinity is guaranteed to work with > FairScheduler that has continuous scheduling enabled. Details on this config > can be found at SAMZA-617 > [design](https://issues.apache.org/jira/secure/attachment/12726945/DESIGN-SAMZA-617-2.pdf). > 8. Added unit tests for TaskManager & ContainerAllocator > 9. Updated config documentation > > Pending items: > 1. Update web-site with info on this feature (SAMZA-668) > > > Diffs > ----- > > build.gradle 3a7fabcd4f4e5c8db950c3a33bba33618c5565b4 > docs/learn/documentation/versioned/jobs/configuration-table.html > c23d7d37f151a0dbdd71f52588773bc67edf88c8 > gradle/dependency-versions.gradle 36d564b6ca895f042ee4802643e49180f4947b62 > samza-core/src/main/java/org/apache/samza/container/LocalityManager.java > c567bf4a1747a7a6aff23be74cf25b1a44e57b42 > > samza-core/src/main/java/org/apache/samza/coordinator/stream/messages/SetContainerHostMapping.java > 5455881e2a4a9c865137f2708f655641b93c2bfb > samza-core/src/main/java/org/apache/samza/job/model/JobModel.java > 7b592740c082225fc217a24188bc874f883db63b > samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala > 61e228b1ff8619da4ad2e11b15a5afb7999c4996 > samza-core/src/main/scala/org/apache/samza/util/Util.scala > c8438bd6ed01a95860867505d84c04c493a5a197 > samza-yarn/src/main/java/org/apache/samza/config/YarnConfig.java > PRE-CREATION > > samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerFailure.java > PRE-CREATION > > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java > PRE-CREATION > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java > PRE-CREATION > > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerRequest.java > PRE-CREATION > samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java > PRE-CREATION > samza-yarn/src/main/resources/scalate/WEB-INF/views/index.scaml > ce2145adde09b0e8097a1c711bc296ee20392e63 > samza-yarn/src/main/scala/org/apache/samza/config/YarnConfig.scala > 1f51551893c42bb13d7941c8b0e5594ac7f42585 > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMaster.scala > af42c6a6636953a95f79837fe372e0dbd735df70 > > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterLifecycle.scala > d475c4c566f17f88a04db5fbc84cc2e27eb333d2 > > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala > 03acfe1bbbabf8f54be9f36fdae785476da45135 > > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala > 060538623e4d67b986bc635518e7fe8ebdde9e24 > > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala > f667c83a7fff43a5efdc64cde019b2cf35f38cb9 > > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterTaskManager.scala > 1743c8611a94fa1c7f7dafd8ff8c713039f3df8c > samza-yarn/src/main/scala/org/apache/samza/job/yarn/YarnJob.scala > 8dd70c9977473e083e463d01b049c40e15b21f4a > > samza-yarn/src/main/scala/org/apache/samza/webapp/ApplicationMasterRestServlet.scala > 09f4dc32a4b18aeb3accb856c360ea2f95c82673 > > samza-yarn/src/main/scala/org/apache/samza/webapp/ApplicationMasterWebServlet.scala > 7fd51221149ba59e7ea1bc0b51d58726ec17a7d7 > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java > PRE-CREATION > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerRequestState.java > PRE-CREATION > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java > PRE-CREATION > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaContainerRequest.java > PRE-CREATION > > samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaTaskManager.java > PRE-CREATION > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerAllocator.java > PRE-CREATION > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java > PRE-CREATION > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockHttpServer.java > PRE-CREATION > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockNMClient.java > PRE-CREATION > > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java > PRE-CREATION > samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestUtil.java > PRE-CREATION > > samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterLifecycle.scala > df5992e659302d2918c4e2c30b6122ed51ab9fe8 > > samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterService.scala > 6f4bfaf916d687426f746fd3b09cd56490bb500e > > samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterTaskManager.scala > 2eec65f02826de40493925c08ff344a8cc4feecb > > Diff: https://reviews.apache.org/r/37817/diff/ > > > Testing > ------- > > Tested with hello-samza locally. Tested with a sample job on a 3 node Yarn > cluster. Works as expected. > > > Thanks, > > Navina Ramesh > >