On 16.11.2010 at 10:26, Chris Jewell wrote:

> Hi all,
> 
>> On 11/15/2010 02:11 PM, Reuti wrote: 
>>> Just to give my understanding of the problem: 
>>>> 
>>>>>> Sorry, I am still trying to grok all your email as what the problem you 
>>>>>> are trying to solve. So is the issue is trying to have two jobs having 
>>>>>> processes on the same node be able to bind there processes on different 
>>>>>> resources. Like core 1 for the first job and core 2 and 3 for the 2nd 
>>>>>> job? 
>>>>>> 
>>>>>> --td 
>> You can't get 2 slots on a machine, as it's limited by the core count to one 
>> here, so such a slot allocation shouldn't occur at all. 
> 
> So to clarify, the current -binding <binding_strategy>:<binding_amount> 
> allocates binding_amount cores to each sge_shepherd process associated with a 
> job_id.  There appears to be only one sge_shepherd process per job_id per 
> execution node, so all child processes run on these allocated cores.  This is 
> irrespective of the number of slots allocated to the node.  
> 
> I agree with Reuti that the binding_amount parameter should be a maximum 
> number of bound cores per node, with the actual number determined by the 
> number of slots allocated per node.  FWIW, an alternative approach might be 
> to have another binding_type ('slot', say) that automatically allocated one 
> core per slot.
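
To illustrate the current behaviour (option syntax as in 6.2u5; the PE name
"mpi" is only an example): with

   qsub -pe mpi 4 -binding linear:2 job.sh

every process of the job on a node is bound to the same two cores, no matter
how many slots were granted there. With the change you suggest, the 2 would
act as an upper bound per node, or a binding type like 'slot' would hand out
one core per granted slot.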
> 
> Of course, a complex situation might arise if a user submits a combined 
> MPI/multithreaded job, but then I guess we're into the realm of setting 
> allocation_rule.

IIRC there was a discussion on the [GE users] list about how to get a uniform
distribution across all slave nodes for such jobs, since e.g. $OMP_NUM_THREADS
is also set to the same value on all slave nodes for hybrid jobs. Otherwise
SGE would have to be adjusted so that the "-builtin-" startup method
automatically sets this value on each node to the locally granted number of
slots. For now a fixed allocation rule of 1, 2, 4 or whatever must be used,
and you have to submit by requesting a wildcard PE to get any of these defined
PEs; that gives an even distribution, and you don't care whether it's two
times two slots, one time four slots, or four times one slot.
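
As a sketch of such a setup (PE names invented, values only indicative):

   $ qconf -sp round2
   pe_name         round2
   slots           999
   allocation_rule 2
   ...

   $ qsub -pe "round*" 4 -binding linear:2 job.sh

With allocation_rule 2 every chosen node gets exactly two slots, so a fixed
-binding linear:2 matches the granted slots, but you need to define one such
PE per slot count you want to allow.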

In my understanding, any type of parallel job should always request and be
granted a total number of slots equal to the cores it needs to execute,
independent of whether these are threads, forks or any hybrid type of job.
Otherwise any resource planning and reservation will most likely fail.
Nevertheless, there might be rare cases where you submit an exclusive serial
job but create threads/forks in the end; such a setup should be the exception,
not the default.
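
For a plain threaded job this could look like (the PE name "smp" and its
allocation_rule $pe_slots are assumptions for the sketch):

   qsub -pe smp 4 -binding linear:4 job.sh

and inside the job script take the thread count from the granted slots
instead of hard-coding it:

   export OMP_NUM_THREADS=$NSLOTS

so the reservation, the binding and the actual number of threads stay in line.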


> Is it going to be worth looking at creating a patch for this?

Absolutely.


>  I don't know much of the internals of SGE -- would it be hard work to do?  
> I've not that much time to dedicate towards it, but I could put some effort 
> in if necessary...

I don't know the exact coding for it, but if it's currently a plain "copy" of
the binding list, it should become a loop that builds the list of cores from
the original specification until every granted slot has a core allocated.
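
Roughly what I have in mind, as a sketch with invented names rather than the
real sge_shepherd code:

   /* Sketch only -- invented names, not the actual SGE source. Instead of
    * copying the requested binding verbatim, derive the number of cores on
    * this node from the granted slots, capped by the requested amount. */
   int cores_to_bind(int requested_amount, int granted_slots)
   {
       /* the requested amount acts as an upper bound per node */
       return (granted_slots < requested_amount) ? granted_slots
                                                 : requested_amount;
   }

   /* e.g. for a "linear" strategy starting at first_core */
   void fill_core_list(int first_core, int n_cores, int *core_list)
   {
       for (int i = 0; i < n_cores; i++)
           core_list[i] = first_core + i;   /* consecutive cores */
   }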

-- Reuti


> 
> Chris
> 
> 
> --
> Dr Chris Jewell
> Department of Statistics
> University of Warwick
> Coventry
> CV4 7AL
> UK
> Tel: +44 (0)24 7615 0778

