Just some more info.

I use a ContinuousMapper within the task that sends off these jobs. Could
that maybe be the reason?
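For context, the pattern looks roughly like this (a sketch; the class and
argument names are made up):

```scala
import java.util.{List => JList}

import org.apache.ignite.cluster.ClusterNode
import org.apache.ignite.compute._
import org.apache.ignite.resources.TaskContinuousMapperResource

// Sketch of the pattern described above: a task that streams jobs out through
// a ComputeTaskContinuousMapper instead of returning them all from map().
class StreamingTask extends ComputeTaskAdapter[Seq[String], Unit] {

  // Ignite injects the continuous mapper into the task instance.
  @TaskContinuousMapperResource
  private var mapper: ComputeTaskContinuousMapper = _

  override def map(subgrid: JList[ClusterNode],
                   args: Seq[String]): java.util.Map[_ <: ComputeJob, ClusterNode] = {
    // Send each job as it becomes ready; Ignite picks the target node, which
    // happens outside the usual up-front mapping done in map().
    for (arg <- args)
      mapper.send(new ComputeJobAdapter {
        override def execute(): AnyRef = { /* run the external process */ null }
      })

    null // all jobs were sent via the mapper, none are returned here
  }

  override def reduce(results: JList[ComputeJobResult]): Unit = ()
}
```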

On Fri, 30 Aug 2019 at 21:49, Pascoe Scholle <pascoescholletr...@gmail.com>
wrote:

> Hi Ilya,
>
> So I have exciting news. As you suggested, I configured some of the nodes
> manually using the CollisionSpi, and it works
> great!
>
> On the more powerful PCs with more RAM available I allow more parallel
> jobs, and on the weaker ones I limit the number of jobs that can run in
> parallel. A really awesome feature.
>
> I now have another issue.
>
> Say I have node A and B both with attribute "compute". So all jobs
> deployed on the cluster are only executed on these two nodes.
>
> Node A runs on a weaker machine so I limit the number of parallel jobs to
> 2.
>
> Node B runs on the stronger machine with 4 parallel jobs running at the
> same time.
>
> Say I send off 100 jobs to the cluster and these are evenly distributed
> between these two nodes. Node B will obviously finish its queue much faster.
>
> So I tried to implement a JobStealingCollisionSpi, but it's not stealing
> any jobs from Node A.
>
> Here is some code to show the configuration; can you point out what is
> missing?
>
> This is the job stealer node that has job stealing enabled:
>
> """"""""""""""""""""""""""""""""""""""""""""""""""""""
>
>   import scala.collection.JavaConverters._
>
>   val jobStealer = new JobStealingCollisionSpi();
>   jobStealer.setStealingEnabled(true);
>
>   /* the stealing attributes must match the target node's user attributes,
>      including the value type (here both use the String "true");
>      .asJava converts the Scala Map to the java.util.Map the SPI expects */
>   jobStealer.setStealingAttributes(Map("compute.node" -> "true").asJava);
>
>   /* set the number of parallel jobs allowed */
>   jobStealer.setActiveJobsThreshold(10);
>
>   // Configure the number of stealing attempts.
>   jobStealer.setMaximumStealingAttempts(10);
>
>   /* JobStealingCollisionSpi must be paired with JobStealingFailoverSpi */
>   val failoverSpi = new JobStealingFailoverSpi();
>
>   val cfg = new IgniteConfiguration();
>
>   val cacheConfig = new CacheConfiguration("myCache");
>
>   /* partition the cache over the entire cluster */
>   cacheConfig.setCacheMode(CacheMode.PARTITIONED);
>
>   cfg.setCacheConfiguration(cacheConfig);
>   cfg.setMetricsUpdateFrequency(10);
>   cfg.setFailoverSpi(failoverSpi);
>   cfg.setPeerClassLoadingEnabled(true);
>   cfg.setCollisionSpi(jobStealer);
>
>   /* jobs are sent to this node for computation; the value is the
>      String "true" so it matches the stealing attributes above */
>   cfg.setUserAttributes(Map("compute.node" -> "true").asJava);
>
> """"""""""""""""""""""""""""""""""""""""""""
>
> Next, the configuration for Node B. Job stealing is disabled on this node,
> so that other nodes can steal jobs from it:
>
> """""""""""""""""""""""""""""""
>   import scala.collection.JavaConverters._
>
>   val jobStealer = new JobStealingCollisionSpi();
>   jobStealer.setStealingEnabled(false);
>
>   /* set the number of parallel jobs allowed */
>   jobStealer.setActiveJobsThreshold(2);
>
>   jobStealer.setStealingAttributes(Map("compute.node" -> "true").asJava);
>
>   val failoverSpi = new JobStealingFailoverSpi();
>
>   /* this line was missing from the original snippet */
>   val cfg = new IgniteConfiguration();
>
>   val cacheConfig = new CacheConfiguration("myCache");
>   cacheConfig.setCacheMode(CacheMode.PARTITIONED);
>
>   cfg.setCacheConfiguration(cacheConfig);
>   cfg.setPeerClassLoadingEnabled(true);
>   cfg.setFailoverSpi(failoverSpi);
>   cfg.setCollisionSpi(jobStealer);
>
>   /* String "true" to match the stealing attributes above */
>   cfg.setUserAttributes(Map("compute.node" -> "true").asJava);
>
> """"""""""""""""""""""""""""""""""""""""""
>
> I also have a third node reserved for running services with a different
> attribute.
>
> Any pointers on why the JobStealer node is not stealing any jobs?
>
> Thanks!
>
> On Fri, 30 Aug 2019 at 15:02, Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> wrote:
>
>> Hello!
>>
>> I think you will have to do it semi-manually: every node should know how
>> many resources are available, and you should not allow jobs to over-consume.
>>
>> You could try to use LoadBalancingSpi, but as far as I have heard it is
>> not trivial.
>>
>> Ignite is not a scheduler, so we don't have any built-ins for resource
>> control.
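>> One semi-manual approach, as a sketch only (the semaphore name and permit
>> count are made up), is a cluster-wide IgniteSemaphore that every heavy job
>> acquires before loading data, so the number of concurrent heavy jobs is
>> capped no matter where they land:
>>
>> ```scala
>> import org.apache.ignite.Ignite
>>
>> // Sketch: cap the number of concurrently running "heavy" jobs cluster-wide
>> // with a distributed semaphore.
>> def runHeavy(ignite: Ignite)(body: => Unit): Unit = {
>>   // failoverSafe = true: permits are released if the owning node dies.
>>   val sem = ignite.semaphore("heavy-jobs", /*permits*/ 4,
>>     /*failoverSafe*/ true, /*create*/ true)
>>   sem.acquire()
>>   try body finally sem.release()
>> }
>> ```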
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Thu, 29 Aug 2019 at 14:27, Pascoe Scholle <pascoescholletr...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a question regarding the workings of the load balancer and
>>> running processes outside the JVM.
>>>
>>> We have a Python command-line tool used for processing big data.
>>> The tool is highly optimized and can load data into RAM almost instantly.
>>> Some datasets are as large as 20 GB.
>>>
>>> When sending multiple jobs that each trigger their own Python process, I
>>> do not want them all to land on the same machine. Is there any way to use
>>> the load balancer to ensure that all jobs are evenly distributed, or to
>>> restrict certain jobs to a node running on a desktop we know has enough
>>> memory available?
>>>
>>> For example, I have two machines, one with 32 GB of RAM and one with
>>> 64 GB, with one Ignite node per machine. Three jobs are sent using
>>> ComputeTaskContinuousMapper. The 32 GB machine received two of the jobs
>>> and obviously froze up. Having some way of ensuring that two jobs are
>>> sent to the machine with more memory would be really helpful.
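>>> One way to restrict jobs to a specific machine would be to tag the
>>> big-memory node with a user attribute and target it with a cluster group.
>>> A sketch (the attribute name "big.memory" is made up; the value would be
>>> set via IgniteConfiguration.setUserAttributes on that node):
>>>
>>> ```scala
>>> import org.apache.ignite.Ignition
>>> import org.apache.ignite.lang.IgniteRunnable
>>>
>>> // Sketch: send memory-hungry jobs only to nodes tagged "big.memory" = "true".
>>> val ignite = Ignition.ignite()
>>> val bigMem = ignite.cluster().forAttribute("big.memory", "true")
>>>
>>> ignite.compute(bigMem).run(new IgniteRunnable {
>>>   override def run(): Unit = { /* launch the python process here */ }
>>> })
>>> ```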
>>>
>>> Thanks and kind regards,
>>> Pascoe
>>>
>>
