Hello! Can you throw together a reproducer project that demonstrates the incorrect behavior? We will look into it and raise a ticket if needed.
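In the meantime, one thing worth double-checking: job stealing needs both a JobStealingCollisionSpi and a JobStealingFailoverSpi configured on every node; the collision SPI alone only sends steal requests, and without the failover SPI nothing actually moves. A minimal sketch of such a node configuration (the class and setter names are from the public Ignite API; the threshold values are illustrative only and should differ per machine):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.collision.jobstealing.JobStealingCollisionSpi;
import org.apache.ignite.spi.failover.jobstealing.JobStealingFailoverSpi;

public class JobStealingNode {
    public static void main(String[] args) {
        JobStealingCollisionSpi collisionSpi = new JobStealingCollisionSpi();

        // Cap on jobs running in parallel on this node. Use a higher
        // value on the powerful desktop, a lower one on weak machines.
        collisionSpi.setActiveJobsThreshold(4);

        // Start stealing as soon as this node has no jobs waiting.
        collisionSpi.setWaitJobsThreshold(0);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCollisionSpi(collisionSpi);

        // The failover SPI is what actually relocates a stolen job;
        // steal requests have no effect without it.
        cfg.setFailoverSpi(new JobStealingFailoverSpi());

        Ignite ignite = Ignition.start(cfg);
    }
}
```

This is a configuration sketch, not a runnable test: it needs the ignite-core dependency and at least two nodes started with compatible configurations before any stealing can be observed.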
Thanks,
--
Ilya Kasnacheev

On Tue, 10 Sep 2019 at 13:01, Pascoe Scholle <[email protected]> wrote:

> Thanks for the prompt response. I have looked at the
> WeightedRandomLoadBalancingSpi. It does not look like one can set the
> number of parallel jobs, though, and this is a big requirement. Also, it
> is inevitable that there will be nodes which will sit idle, due to the
> nature of the jobs that will be deployed on the nodes, so the job
> stealer just seems like the perfect solution. Regardless, I have used
> the code provided for the job stealing SPI on the docs page and it isn't
> functioning as intended.
>
> On Tue, 10 Sep 2019 at 11:34, Stephen Darlington <
> [email protected]> wrote:
>
>> I don't know the answer to your job stealing question, but I do wonder
>> if that's the right configuration for your requirements. Why not use
>> the weighted load balancer
>> (https://apacheignite.readme.io/docs/load-balancing)? That's designed
>> to work in cases where nodes are of differing sizes.
>>
>> Regards,
>> Stephen
>>
>> On 10 Sep 2019, at 10:19, Pascoe Scholle <[email protected]>
>> wrote:
>>
>> Hello,
>>
>> is there any update on this? We have not been able to resolve this
>> issue.
>>
>> Kind regards
>>
>> On Wed, 04 Sep 2019 at 07:44, Pascoe Scholle <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> I have attached a small Scala project. Just set the build path to src
>>> after building and compiling with sbt.
>>>
>>> We want to execute processes that happen outside the JVM. These
>>> processes can be extremely memory intensive, which is why I am
>>> limiting the number of parallel jobs that can be executed on a
>>> machine.
>>>
>>> I have one desktop that has a lot more memory available and can thus
>>> execute more jobs in parallel. As all jobs take roughly the same
>>> amount of time, this machine will have completed its jobs much
>>> faster. I want it to then take jobs from the nodes started on weaker
>>> machines once it has completed all its tasks.
>>>
>>> Does that make sense?
>>>
>>> Hope this helps.
>>>
>>> BR,
>>> Pascoe
>>>
>>> On Tue, 3 Sep 2019 at 17:29, Andrei Aleksandrov <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Some remarks about the job stealing SPI:
>>>>
>>>> 1) You have some nodes that can process the tasks of a compute job.
>>>> 2) Tasks are executed in the public thread pool by default:
>>>> https://apacheignite.readme.io/docs/thread-pools#section-public-pool
>>>> 3) If one node's thread pool is busy, then a task of the compute job
>>>> can be executed on another node.
>>>>
>>>> It will not work in the following cases:
>>>>
>>>> 1) If you choose a specific node for your compute task.
>>>> 2) If you do an affinity call (the same as above, but the node is
>>>> chosen by affinity mapping).
>>>>
>>>> Regarding your case:
>>>>
>>>> It's not clear to me what exactly you are trying to do. Possibly job
>>>> stealing didn't kick in because your weaker node did begin executing
>>>> tasks in the public pool, but simply took longer than the faster
>>>> node.
>>>>
>>>> Could you please share your full reproducer for investigation?
>>>>
>>>> BR,
>>>> Andrei
>>>>
>>>> On 9/3/2019 1:43 PM, Pascoe Scholle wrote:
>>>> > Hi there,
>>>> >
>>>> > I have asked this question before, but under a different and
>>>> > resolved topic, so I am posting it again under a more suitable
>>>> > title. I hope that's ok.
>>>> >
>>>> > We have tried to configure two compute server nodes, one of which
>>>> > is running on a weaker machine. The node running on the more
>>>> > powerful machine always finishes its tasks far before the weaker
>>>> > node and then sits idle.
>>>> >
>>>> > The node is not even sending a steal request, so I must have
>>>> > configured something wrong.
>>>> >
>>>> > I have attached the code for both nodes; if you could kindly point
>>>> > out what I am missing, I would really appreciate it!
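Andrei's two non-working cases can be illustrated in Java against the Ignite compute API: only load-balanced submissions are candidates for stealing, while jobs pinned to an explicit node or routed by affinity stay where they are. A sketch (the cache name "myCache", the key 42, and the job body are made up for illustration; the method names are from the public IgniteCompute/IgniteCluster API):

```java
import java.util.UUID;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.lang.IgniteRunnable;

public class StealingCases {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();
        IgniteRunnable job = () -> System.out.println("run external process");

        // Can be stolen: the job is load-balanced across the cluster, so a
        // busy node's queued jobs may migrate to an idle node.
        ignite.compute().run(job);

        // Cannot be stolen: the job is pinned to an explicitly chosen node.
        UUID nodeId = ignite.cluster().forRemotes().node().id();
        ignite.compute(ignite.cluster().forNodeId(nodeId)).run(job);

        // Cannot be stolen: the target node is chosen by affinity mapping
        // of the key, so the job must run where that key's partition lives.
        ignite.compute().affinityRun("myCache", 42, job);
    }
}
```

Again a sketch rather than a runnable test: it assumes a started multi-node cluster with a cache named "myCache" already configured.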
