I have a section of code running in rank 0 that must start a Perl program
and then connect to it via a TCP socket.
The initialisation section is shown here:

    sprintf(buf, "%s/session_server.pl -p %d &", PATH,port);
    int i = system(buf);
    printf("system returned %d\n", i);
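
A minimal sketch of the same launch with bounded formatting and the return value decoded, assuming a POSIX system. The helper names `build_command` and `run_command` are hypothetical and are not part of the code above or of Open MPI. Because the command ends in `&`, the status that system() reports is that of the shell that backgrounds the script, not of the Perl program itself.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: format the launch command into buf.
 * Returns 0 on success, -1 if the command would not fit in len bytes. */
int build_command(char *buf, size_t len, const char *path, int port)
{
    int n = snprintf(buf, len, "%s/session_server.pl -p %d &", path, port);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: run cmd through system() and decode the wait status.
 * Returns the shell's exit code, or -1 if system() itself failed or the
 * shell did not exit normally (for example, it was killed by a signal). */
int run_command(const char *cmd)
{
    int status = system(cmd);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A result of 0 from `run_command` here only says that the shell managed to background the command; it does not show that the Perl server started or is listening on the port.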


Some time after this code runs, while rank 0 is waiting for data from the Perl
program, the error below occurs:

qplan connection
DCsession_fetch: waiting for Mcode data...
[dc1:05387] [[40050,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 105
[dc1:05387] [[40050,1],0] could not get route to [[INVALID],INVALID]
[dc1:05387] [[40050,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file base/plm_base_proxy.c at line 86


It seems that the Linux system() call is breaking OpenMPI's internal connections.
Removing the system() call and starting the Perl program externally fixes the
problem, but I can't go into production like that as it's a security issue.

Any ideas?


(Environment: OpenMPI 1.4.1 on host dc1, Linux kernel
2.6.18-274.3.1.el5.028stab094.3, using TCP and mpirun)
