On 12.10.2010, at 15:49, Dave Love wrote:

> Chris Jewell <chris.jew...@warwick.ac.uk> writes:
>
>> I've scrapped this system now in favour of the new SGE core binding feature.
>
> How does that work, exactly? I thought the OMPI SGE integration didn't
> support core binding, but good if it does.
With binding_instance set to "set" (the default), the shepherd should already bind the processes to cores. With the other types of binding_instance, the selected cores must be forwarded to the application via an environment variable or in the hostfile. As the binding is only a hint to SGE and not a hard request, the user must plan the allocation a bit beforehand; in particular, it won't work if you oversubscribe a machine.

When I look at /proc/*/status, the applied binding shows up there, and it's also noted in the "config" file of each job's .../active_jobs/... directory. E.g. top shows:

 9926 ms04      39  19  3756  292  228 R   25  0.0   0:19.31 ever
 9927 ms04      39  19  3756  292  228 R   25  0.0   0:19.31 ever
 9925 ms04      39  19  3756  288  228 R   25  0.0   0:19.30 ever
 9928 ms04      39  19  3756  292  228 R   25  0.0   0:19.30 ever

for 4 forks of an endless loop in one and the same job script, submitted with `qsub -binding linear:1 demo.sh`.

Well, the funny thing is that with this kernel version I still get a load of 4, despite the fact that all 4 forks are bound to one core. Should it really be four?

-- Reuti

> ______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
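P.S. For anyone who wants to reproduce this: a minimal job script along the lines of the demo.sh above could look like the sketch below. Only the 4 forks of an endless loop and the `-binding linear:1` submission are what I actually described; the exact loop construct and the Cpus_allowed_list check are just illustrative.

    #!/bin/sh
    #$ -N demo
    #$ -cwd

    # Start 4 forks of an endless loop; with "-binding linear:1" the
    # shepherd binds the job shell to one core and the forks inherit it.
    for i in 1 2 3 4; do
        while :; do :; done &
    done

    # Illustrative check: show the core mask applied to this job shell.
    grep Cpus_allowed_list /proc/$$/status

    wait

Submitted as before with:

    qsub -binding linear:1 demo.sh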