[ 
https://issues.apache.org/jira/browse/IGNITE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958749#comment-14958749
 ] 

Mark Howard commented on IGNITE-1267:
-------------------------------------

We've also hit this problem. I think it's broader than the title suggests, 
though - it affects any node that is not in the original topology, not just new 
nodes. 

In our case, we're using Ignite for relatively long jobs with a very small 
fanout - most tasks map to a single job, on a cluster of perhaps 100 nodes. Due 
to the topology restrictions in the collision and failover SPIs, these jobs can 
never be stolen, either by a new node or by an existing idle node. 

The fix is relatively easy for us - comment out the topology checks in the job 
stealing collision and failover SPIs. This is valid for us since our initial 
load balancing is relatively straightforward, based on node attributes, and the 
same node attributes are used in the job stealing configuration. It may not be 
entirely generic, though, since it's not as powerful as the original TopologySpi 
which was in early versions of GridGain. Without it, though, the collision SPIs 
are pretty much useless as they stand in the 1.4 release (unless we've missed 
something!).
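
To illustrate the difference between the two checks, here is a minimal 
plain-Java sketch. It does not use any Ignite classes; the class name, method 
names, node IDs and attribute names are all hypothetical and only model the 
idea: a check against the topology snapshot taken when the task started 
rejects a late joiner, while a check against node attributes accepts it.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical model only: none of these names come from the Ignite code base.
public class StealEligibilitySketch {
    /**
     * Models the restriction described in this issue: a node may steal a job
     * only if it was in the topology snapshot taken when the task started.
     */
    public static boolean eligibleByStaticTopology(Set<String> taskTopologySnapshot,
                                                   String thiefNodeId) {
        return taskTopologySnapshot.contains(thiefNodeId);
    }

    /**
     * Models the attribute-based alternative: a node may steal a job if its
     * attributes include every required key/value pair, whenever it joined.
     */
    public static boolean eligibleByAttributes(Map<String, String> requiredAttrs,
                                               Map<String, String> thiefAttrs) {
        return thiefAttrs.entrySet().containsAll(requiredAttrs.entrySet());
    }

    public static void main(String[] args) {
        // Nodes present when the task was executed (assumed IDs).
        Set<String> snapshot = Set.of("node-A", "node-B");

        // Attributes used for both load balancing and stealing (assumed names).
        Map<String, String> required = Map.of("role", "worker");

        // node-C joined after the task started but carries the right attribute.
        Map<String, String> lateJoinerAttrs = Map.of("role", "worker", "rack", "r2");

        System.out.println("static topology check: "
            + eligibleByStaticTopology(snapshot, "node-C"));
        System.out.println("attribute check:       "
            + eligibleByAttributes(required, lateJoinerAttrs));
    }
}
```

The attribute check is only as good as the attributes themselves, which is why 
this approach may not generalise beyond setups where load balancing and 
stealing share the same attributes.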

> JobStealingCollisionSpi never sends jobs to a node that joined after task was 
> executed
> --------------------------------------------------------------------------------------
>
>                 Key: IGNITE-1267
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1267
>             Project: Ignite
>          Issue Type: Bug
>          Components: compute
>    Affects Versions: 1.1.4
>            Reporter: Valentin Kulichenko
>              Labels: user-request
>
> Corresponding user thread (contains detailed description of the scenario that 
> doesn't work): 
> http://apache-ignite-users.70518.x6.nabble.com/Dynamic-ComputeTask-distribution-with-new-nodes-td997.html
> Essentially, {{JobStealingCollisionSpi}} always skips jobs that are not in 
> the task topology (see line 713). The task topology is static and is created 
> when the task is executed, so a newly joined node can't steal jobs. I think it 
> should be able to do this if it satisfies the initial cluster group predicate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
