Which network transport are you using, and what version of Open MPI are you 
using?  Do you have OpenFabrics support compiled into your Open MPI 
installation?

If you're just using TCP and/or shared memory, I can't immediately think of a 
reason why this wouldn't work, but there may be a subtle interaction somewhere 
that causes badness (e.g., memory corruption).
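
While narrowing this down, it may also help to check what system() actually 
reports. The value it returns is a wait status, not the command's exit code, so 
printing it raw can be misleading. Also, because the command line ends in "&", 
the shell returns as soon as it has backgrounded the script, so a 0 here only 
says the shell started, not that session_server.pl is healthy. Below is a 
minimal sketch, not a fix for the ORTE errors; build_server_command() and 
run_command() are hypothetical helper names, and the path and port in any call 
are placeholders:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: build the launch command with a bounded write.
 * Returns 0 on success, -1 if the command did not fit in buf. */
int build_server_command(char *buf, size_t len, const char *path, int port)
{
    int n = snprintf(buf, len, "%s/session_server.pl -p %d &", path, port);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: run a shell command via system() and decode the
 * wait status. Returns the command's exit code, or -1 if the shell could
 * not be started or the command did not exit normally. */
int run_command(const char *cmd)
{
    int status = system(cmd);
    if (status == -1) {
        perror("system");           /* fork or wait failed */
        return -1;
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status); /* normal exit: real exit code */
    }
    if (WIFSIGNALED(status)) {
        fprintf(stderr, "command killed by signal %d\n", WTERMSIG(status));
    }
    return -1;
}
```

In the rank 0 code this would replace the sprintf()/system() pair: call 
build_server_command() first, then pass the buffer to run_command() and log the 
decoded result.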


On Jan 19, 2012, at 1:57 AM, Randolph Pullen wrote:

> 
> I have a section in my code running in rank 0 that must start a Perl program 
> that it then connects to via a TCP socket.
> The initialisation section is shown here:
> 
>     sprintf(buf, "%s/session_server.pl -p %d &", PATH,port);
>     int i = system(buf);
>     printf("system returned %d\n", i);
> 
> 
> Some time after I run this code, while waiting for the data from the perl 
> program, the error below occurs:
> 
> qplan connection
> DCsession_fetch: waiting for Mcode data...
> [dc1:05387] [[40050,1],0] ORTE_ERROR_LOG: A message is attempting to be sent 
> to a process whose contact information is unknown in file rml_oob_send.c at 
> line 105
> [dc1:05387] [[40050,1],0] could not get route to [[INVALID],INVALID]
> [dc1:05387] [[40050,1],0] ORTE_ERROR_LOG: A message is attempting to be sent 
> to a process whose contact information is unknown in file 
> base/plm_base_proxy.c at line 86
> 
> 
> It seems that the Linux system() call is breaking Open MPI's internal 
> connections.  Removing the system() call and executing the Perl code 
> externally fixes the problem, but I can't go into production like that as 
> it's a security issue.
> 
> Any ideas ?
> 
> (environment: OpenMPI 1.4.1 on kernel Linux dc1 
> 2.6.18-274.3.1.el5.028stab094.3  using TCP and mpirun)
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

