Designation: Non-Export Controlled Content
Another follow-up. If I run all proxies on the same node as the dispatcher, 
it works, even with all sensors spread across different nodes. If I force the 
proxies onto another node, they all fail. Here is some more error output.

[Attached image: image001.png (error output)]


3.1.1001
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Marlborough, Rick
Sent: Wednesday, October 12, 2016 5:44 PM
To: Open MPI Users
Subject: Re: [OMPI users] clarity on Comm_connect


...forgot to mention...

I have a group of processes called sensors and a group of processes called 
proxies. A central dispatch process launches all of the sensors, followed by all 
of the proxies. The sensors publish named ports and wait on MPI_Comm_accept. 
The proxies look up the named port and do an MPI_Comm_connect. If this all 
occurs on the same node as the dispatcher, then all proxies connect to their 
respective sensors and all is well. If I configure my slots to force proxies or 
sensors onto other nodes (I have 20), then the connections fail. There is full 
connectivity between all of these nodes. We are testing various forms of 
middleware. Some use TCP, some use UDP, some use multicast. All work. Full ssh 
connectivity is set up between all of these nodes. Oddly enough, the sensors all 
perform a Comm_connect to the dispatcher. This always works! The sensors and 
proxies are all spawned in 2 batches using Comm_spawn_multiple. Error message 
below. Is there some configuration to enable this?

[Attached image: image001.png (error message)]


From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Marlborough, Rick
Sent: Wednesday, October 12, 2016 4:47 PM
To: users@lists.open-mpi.org<mailto:users@lists.open-mpi.org>
Subject: [OMPI users] clarity on Comm_connect


Folks;
Trying to do an MPI_Lookup_name. The call is surrounded by a try/catch block. 
Even with the try/catch block, the calling process will still abort if the 
publishing process has not published the name. Is there a way to configure or 
code MPI so that it throws a trappable exception?

Thanx
Rick
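
A possible way to approach the question above, offered as a sketch rather than a confirmed fix: MPI's default error handler on communicators is MPI_ERRORS_ARE_FATAL, which aborts the process before any C++ catch block can run. Calling MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN) asks MPI to hand back an error code instead, and the lookup can then be retried until the publisher is ready. Whether a particular Open MPI release honors this for MPI_Lookup_name should be verified. The names `LookupFn` and `wait_for_port` below are hypothetical, and the MPI call is kept behind a callable so the retry logic is plain C++17.

```cpp
#include <chrono>
#include <functional>
#include <optional>
#include <string>
#include <thread>

// Hypothetical callable type: yields the port name once the service has been
// published, or std::nullopt while the lookup still fails.
// In an MPI program this would wrap MPI_Lookup_name, called after
// MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN) so that a failed
// lookup returns an error code instead of aborting (assumption: the MPI
// implementation in use honors the handler for this call).
using LookupFn = std::function<std::optional<std::string>()>;

// Polls `lookup` up to `max_attempts` times, sleeping `delay` between tries.
// Returns the port name on success, std::nullopt if every attempt failed.
std::optional<std::string> wait_for_port(const LookupFn& lookup,
                                         int max_attempts,
                                         std::chrono::milliseconds delay) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if (auto port = lookup()) {
            return port;  // publisher is ready
        }
        if (attempt + 1 < max_attempts) {
            std::this_thread::sleep_for(delay);  // give the publisher time
        }
    }
    return std::nullopt;  // caller decides how to handle the failure
}
```

In an MPI program, the callable would invoke MPI_Lookup_name(service_name, MPI_INFO_NULL, port_name) with a buffer of MPI_MAX_PORT_NAME characters and return the port only when the result equals MPI_SUCCESS. The caller then decides what to do when `wait_for_port` returns no value, which gives the "trappable" behavior without relying on exceptions.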

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
