> On Oct. 5, 2016, 6:18 p.m., Jagadish Venkatraman wrote:
> > Overall, the patch looks great! This is exciting given that Samza can 
> > support scheduling based on tags. For example, jobs with rocksdb can be 
> > assigned to nodes with SSDs.
> > 
> > 
> > Can you please add some detail on testing this feature? 
> > What was the label setup of the cluster? (for example: Did we use an 
> > exclusive node label?), How many node labels? How many containers were 
> > requested for the job?
> 
> Maxim Logvinenko wrote:
>     We haven't tested it in production, but the main idea is as follows: we 
> have 3 different types of nodes in our Hadoop cluster. The first type is used 
> for ApplicationMasters (we put up to 4 AM containers on one node). 
> The second type is used for stateless jobs and has a small amount of memory. 
> The last type is used for stateful jobs and has more memory than the others. 
> So there are 3 labels in our cluster: taskam, tasklowmem, taskhighmem. 
> Currently we force YARN to put containers on a particular type of node with 
> a small trick with resources (we choose node resources so that YARN has no 
> option other than a single type of node). But Hadoop node labels are a more 
> natural way to request that containers be placed on a specific type of node.
> 
> Jagadish Venkatraman wrote:
>     So, in this case, do you not care about *host affinity* at all when the 
> job re-starts? Are you okay with your container coming back up on a different 
> host (as long as it is a host with label `taskHighMem`)? We should make it 
> explicit that when host-affinity.enabled=true, then node labelling will be 
> ignored. Is my understanding reasonable?

It seems Hadoop has a bug (or a feature, I'm not sure what to call it): the 
node label expression is ignored if the preferred host != "ANY". So when we 
run a Samza job for the first time, it has no preferred host, and hence the 
label is used for the resource request. Each subsequent resource request for 
this container will use the preferred host and put the container on the same 
node as before. But if this node has failed (or is unreachable for any other 
reason), Samza will still send the preferred host in the resource request, and 
Hadoop can then allocate the resource on any node that satisfies the 
<vcores, memory> conditions. I need to investigate this more to answer the 
question.
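
To make the behaviour I observed concrete, here is a minimal Java sketch. It 
only models what is described above and is not Hadoop or Samza code: the class 
LabelRequestSketch, the method effectiveLabel and the host name node-17 are 
hypothetical names used for illustration, and the rule itself is my current 
understanding, still to be confirmed.

```java
public class LabelRequestSketch {
    // The wildcard host name meaning "no preferred host".
    static final String ANY_HOST = "ANY";

    /**
     * Hypothetical model of the observed behaviour: the label expression
     * only takes effect when the request has no preferred host.
     * Returns null when the label would be ignored.
     */
    static String effectiveLabel(String preferredHost, String configuredLabel) {
        boolean noPreferredHost =
            preferredHost == null || ANY_HOST.equals(preferredHost);
        return noPreferredHost ? configuredLabel : null;
    }

    public static void main(String[] args) {
        // First run: no preferred host, so the label constrains placement.
        System.out.println(effectiveLabel(ANY_HOST, "taskhighmem"));
        // Restart with a preferred host: the label is ignored, so if that
        // host is down the container may land on any node that fits.
        System.out.println(effectiveLabel("node-17", "taskhighmem"));
    }
}
```

If this model is right, a restarted container whose preferred host is gone is 
no longer guaranteed to land on a taskhighmem node.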


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51633/#review151519
-----------------------------------------------------------


On Oct. 7, 2016, 12:08 a.m., Maxim Logvinenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51633/
> -----------------------------------------------------------
> 
> (Updated Oct. 7, 2016, 12:08 a.m.)
> 
> 
> Review request for samza.
> 
> 
> Bugs: SAMZA-1013
>     https://issues.apache.org/jira/browse/SAMZA-1013
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> YARN node labels were introduced in Hadoop 2.6. They allow nodes with 
> similar characteristics to be grouped and allow applications to specify 
> where to run. This patch adds support for YARN node labels in Samza.
> 
> In this implementation, node labels are defined directly in yarnConfig in 
> YarnClusterResourceManager. It might be better to have node labels as part 
> of the SamzaResourceRequest and SamzaResource classes, but the 
> org.apache.hadoop.yarn.api.records.Container class doesn't contain a node 
> label, and hence we have nothing to pass to the SamzaResource constructor in 
> the onContainersAllocated method of the YarnClusterResourceManager class.
> 
> 
> Diffs
> -----
> 
>   samza-yarn/src/main/java/org/apache/samza/config/YarnConfig.java 8f2dc48 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/YarnClusterResourceManager.java 96d3d7c 
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/ClientHelper.scala 0998c43 
> 
> Diff: https://reviews.apache.org/r/51633/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Maxim Logvinenko
> 
>
