> On Oct. 5, 2016, 6:18 p.m., Jagadish Venkatraman wrote:
> > Overall, the patch looks great! This is exciting given that Samza can 
> > support scheduling based on tags. For example, jobs with rocksdb can be 
> > assigned to nodes with SSDs.
> > 
> > 
> > Can you please add some detail on testing this feature? 
> > What was the label setup of the cluster? (for example: Did we use an 
> > exclusive node label?), How many node labels? How many containers were 
> > requested for the job?
> 
> Maxim Logvinenko wrote:
>     We haven't tested it in production, but the main idea is as follows: we 
> have 3 different types of nodes in our Hadoop cluster. The first type is used 
> for ApplicationMasters (we put up to 4 AM containers on one node). The 
> second type is used for stateless jobs and has a small amount of memory. The 
> last type is used for stateful jobs and has more memory than the others. So, 
> there are 3 labels in our cluster: taskam, tasklowmem, taskhighmem. Currently 
> we force YARN to put containers on a particular type of node with a small 
> trick with resources (we choose the resources for each node in such a way 
> that YARN has no option other than one type of node). But Hadoop node labels 
> are a more natural way to request that containers be placed on a specific 
> type of node.

So, in this case, do you not care about *host affinity* at all when the job 
restarts? Are you okay with your container coming back up on a different host 
(as long as it is a host with the label `taskhighmem`)? We should make it 
explicit that when host-affinity.enabled=true, node labels will be ignored. Is 
my understanding correct?
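
To make the rule I am describing concrete, here is a minimal sketch. The 
class and method names are hypothetical and not part of the patch; it only 
assumes that host affinity takes precedence and that an empty label means 
"no label requested".

```java
import java.util.Optional;

public class LabelPolicy {
    // Hypothetical helper, not taken from the patch: decides which node label
    // expression (if any) should accompany a container request.
    static Optional<String> effectiveNodeLabel(boolean hostAffinityEnabled,
                                               String configuredLabel) {
        // Assumption: with host affinity on, the preferred host wins and the
        // configured label is ignored.
        if (hostAffinityEnabled) {
            return Optional.empty();
        }
        // A missing or blank label means the request carries no label.
        if (configuredLabel == null || configuredLabel.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(configuredLabel.trim());
    }

    public static void main(String[] args) {
        System.out.println(effectiveNodeLabel(false, "taskhighmem"));
        System.out.println(effectiveNodeLabel(true, "taskhighmem"));
    }
}
```

If we go the other way and let labels apply together with host affinity, only 
the first branch would change.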


- Jagadish


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51633/#review151519
-----------------------------------------------------------


On Oct. 7, 2016, 12:08 a.m., Maxim Logvinenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51633/
> -----------------------------------------------------------
> 
> (Updated Oct. 7, 2016, 12:08 a.m.)
> 
> 
> Review request for samza.
> 
> 
> Bugs: SAMZA-1013
>     https://issues.apache.org/jira/browse/SAMZA-1013
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> YARN node labels were introduced in Hadoop version 2.6. They allow nodes 
> with similar characteristics to be grouped and let applications specify 
> where to run. This patch adds support for YARN node labels in Samza.
> 
> In this implementation, node labels are defined directly in yarnConfig in 
> YarnClusterResourceManager. It might be better to have node labels as part 
> of the SamzaResourceRequest and SamzaResource classes, but the 
> org.apache.hadoop.yarn.api.records.Container class doesn't contain a node 
> label, so we have nothing to pass to the SamzaResource constructor in the 
> onContainersAllocated method of the YarnClusterResourceManager class.
> 
> 
> Diffs
> -----
> 
>   samza-yarn/src/main/java/org/apache/samza/config/YarnConfig.java 8f2dc48 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/YarnClusterResourceManager.java 96d3d7c 
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/ClientHelper.scala 0998c43 
> 
> Diff: https://reviews.apache.org/r/51633/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Maxim Logvinenko
> 
>
