Thank you for your help, Ralph and Reuti,

The problem turned out to be an insufficient number of file descriptors.

The reason given by a sysadmin was that, since SGE isn't a user, it wasn't
initially using the new upper bound on the number of file descriptors.
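
For anyone hitting the same symptom, here is a minimal sketch of how the limit could be checked from inside a job. It assumes the limit in question is the per-process RLIMIT_NOFILE; the helper names fd_limits and enough_descriptors are hypothetical, made up for this illustration. Running it once from a login shell and once from a script submitted through SGE should show whether the job actually sees the raised limit.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def enough_descriptors(needed, soft_limit):
    """Return True if soft_limit covers `needed` descriptors.

    resource.RLIM_INFINITY means "no limit", so it always suffices.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= needed


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={soft} hard={hard}")
```

If the two runs print different soft limits, the job is not inheriting the limit that the interactive shell has.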

-Bill Lane




________________________________________
From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Tuesday, July 26, 2011 1:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] Can run OpenMPI testcode on 86 or fewer slots in cluster, but nothing more than that

On Jul 26, 2011, at 1:58 PM, Reuti wrote:
>>>> allocation_rule    $fill_up
>>>
>>> Here you specify to fill one machine after the other completely before 
>>> gathering slots from the next machine. You can change this to $round_robin 
>>> to get one slot from each node before taking a second from particular 
>>> machines. If you prefer a fixed allocation, you could also put an integer 
>>> here.
>>
>> Remember, OMPI only uses sge to launch one daemon/node. The placement of MPI 
>> procs is totally up to mpirun itself, which doesn't look at any SGE envar.
>
> I thought that was the purpose of using --with-sge during configure: you
> don't have to provide any hostlist at all, and Open MPI will honor the
> allocation by reading SGE envars to get the granted slots?
>

We use the envars to determine how many slots were allocated, but not the 
placement. So we'll look at the envar to get the number of slots allocated on 
each node, but we then determine the layout of processes against the slots. To 
the point, we don't look at an sge envar to determine how that layout is to be 
done.

I was only trying to point out the difference. I admit it can be confusing when 
using sge, especially since sge doesn't actually have visibility into the MPI 
procs themselves (i.e., the only processes launched by sge are the daemons).
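
To make the distinction concrete, here is a rough sketch, not Open MPI's actual mapper. It assumes the allocation arrives as lines of "host slots queue processor-range" (the layout of SGE's $PE_HOSTFILE as I understand it), and the function names are hypothetical. The slot counts come from SGE; the two layout functions stand in for the decision mpirun makes on its own.

```python
def parse_pe_hostfile(text):
    """Parse 'host slots queue processor-range' lines into [(host, slots)]."""
    alloc = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            alloc.append((fields[0], int(fields[1])))
    return alloc


def map_by_slot(alloc, nprocs):
    """Fill every slot on a node before moving on to the next node."""
    layout = []
    for host, slots in alloc:
        for _ in range(slots):
            if len(layout) == nprocs:
                return layout
            layout.append(host)
    return layout


def map_by_node(alloc, nprocs):
    """Deal ranks out one per node in turn, skipping nodes with no free slot."""
    remaining = [slots for _, slots in alloc]
    layout = []
    while len(layout) < nprocs and any(remaining):
        for i, (host, _) in enumerate(alloc):
            if len(layout) == nprocs:
                break
            if remaining[i] > 0:
                layout.append(host)
                remaining[i] -= 1
    return layout


if __name__ == "__main__":
    # Hypothetical two-node allocation with two slots each.
    sample = "node01 2 all.q@node01 UNDEFINED\nnode02 2 all.q@node02 UNDEFINED\n"
    alloc = parse_pe_hostfile(sample)
    print(map_by_slot(alloc, 4))
    print(map_by_node(alloc, 4))
```

Both layouts use exactly the same allocation; only the order in which ranks land on the nodes differs, and that order is not something sge dictates.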



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
IMPORTANT WARNING: This message is intended for the use of the person or entity 
to which it is addressed and may contain information that is privileged and 
confidential, the disclosure of which is governed by applicable law. If the 
reader of this message is not the intended recipient, or the employee or agent 
responsible for delivering it to the intended recipient, you are hereby 
notified that any dissemination, distribution or copying of this information is 
STRICTLY PROHIBITED. If you have received this message in error, please notify 
us immediately by calling (310) 423-6428 and destroy the related message. Thank 
You for your cooperation.