Yes, I am (the master and child 1 are running on the same machine).
But knowing about the oversubscription issue, I am using
mpi_yield_when_idle, which should fix precisely this problem, right?
Or is the option ignored when there is initially no second process? I
did give both machines multiple slots, so Open MPI "knows" that
further oversubscription may arise.
Confused,
Murat
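
For readers following the thread, here is a minimal sketch of the usual ways the mpi_yield_when_idle MCA parameter can be set in Open MPI. The program name ./parent is a placeholder and not taken from the original message. Whether the setting takes effect for processes that are spawned later is exactly the open question here, so this shows only how the option is passed, not that it resolves the hang.

```
# On the mpirun command line (./parent is a hypothetical program name):
mpirun --mca mpi_yield_when_idle 1 -np 1 ./parent

# Or through the environment before launching:
export OMPI_MCA_mpi_yield_when_idle=1

# Or persistently, as a line in $HOME/.openmpi/mca-params.conf:
mpi_yield_when_idle = 1
```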


Jeff Squyres wrote:
> Are you perchance oversubscribing your nodes?
>
> Open MPI does not currently handle it well when you initially
> undersubscribe your nodes but then, due to spawning, oversubscribe
> them. In this case, OMPI will be polling aggressively in all
> processes, not realizing that the node is now oversubscribed and that
> it should be yielding the processor so that other processes can run.
>
> On Oct 30, 2007, at 10:57 AM, Murat Knecht wrote:
>
>   
>> Hi,
>>
>> does anyone know whether there is a special requirement on the order
>> in which processes are spawned and the intercommunicators are
>> subsequently merged? I have two hosts, let's call them local and
>> remote, and a parent process on local that spawns one process on each
>> of the two nodes. After each spawn, the parent process and all
>> existing children participate in merging the newly created
>> intercommunicator into an intracommunicator that, in the end,
>> connects all three processes.
>>
>> The weird thing is that when I spawn them in the order local, remote,
>> all three processes block at the second (and last) spawn when they
>> reach MPI_Intercomm_merge. When I switch the order around and spawn
>> first the process on remote and then the one on local, everything
>> works: the two processes are spawned and the intracommunicators are
>> created by the merge. Everything goes well, too, if I spawn both
>> processes on the same machine, whichever one it is. (The existing
>> children are informed via a message that they should participate in
>> the spawn and merge, since these are collective operations.)
>>
>> Is there some implicit developer-level knowledge that explains why the
>> order determines the outcome? Logically, there ought to be no
>> difference. By the way, I work with two Linux nodes and an ordinary
>> Ethernet TCP connection between them.
>>
>> Thanks,
>> Murat
