Re: [OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-21 Thread Ralph Castain
Have you looked through the code in orte/mca/plm/rsh/plm_rsh_module.c? It is executing a tree-like spawn pattern by default, but there isn't anything magic about what ssh is doing. However, there are things done to prep the remote shell (setting paths etc.), and the tree spawn passes some additiona

Re: [OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-21 Thread Yann RADENAC
On 20/08/2012 15:56, Ralph Castain wrote:
> You might try adding "-mca plm_base_verbose 5 --debug-daemons" to watch the debug output from the daemons as they are launched.
There seems to be an interference here: my problem is "solved" by enabling the --debug-daemons option with a verbose level >

Re: [OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-20 Thread Ralph Castain
Just to be clear: what you are launching is an orted daemon, not your application process. Once the daemons are running, then we use them to launch the actual application process. So the issue here is with starting the daemons themselves. You might try adding "-mca plm_base_verbose 5 --debug-dae
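A minimal sketch of how the suggested flags could be assembled into a launch command. The two debugging options are the ones quoted in the message above. The helper name build_debug_mpirun, the ./hello program, and the use of plm_rsh_agent as the MCA parameter that selects the agent are assumptions made for illustration; the parameter name may differ between Open MPI versions.

```python
import shlex


def build_debug_mpirun(app_argv, rsh_agent=None):
    """Return an mpirun argv that includes the daemon-debugging flags."""
    # Flags taken from the message: verbose PLM output plus daemon debug output.
    cmd = ["mpirun", "-mca", "plm_base_verbose", "5", "--debug-daemons"]
    if rsh_agent is not None:
        # Assumption: plm_rsh_agent selects the program used in place of ssh/rsh.
        cmd += ["-mca", "plm_rsh_agent", rsh_agent]
    # The application part (process count, executable, arguments) goes last.
    return cmd + list(app_argv)


# Hypothetical application: two processes of ./hello, started through the custom agent.
argv = build_debug_mpirun(["-np", "2", "./hello"], rsh_agent="xos-createProcess")
print(shlex.join(argv))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be handed directly to a process launcher without shell quoting problems.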

[OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-20 Thread Yann RADENAC
Hi, I'm developing MPI support for XtreemOS (www.xtreemos.eu) so that an MPI program is managed as a single XtreemOS job. To manage all processes as a single XtreemOS job, I've developed the program xos-createProcess that plays the role of the rsh agent (replacing ssh/rsh) to start a process
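For readers unfamiliar with the role of an rsh agent, here is a rough sketch of the argument handling such a replacement might perform. It assumes the launcher calls the agent ssh-style, as "agent [options] host command [args ...]", which is not confirmed by the message. The function split_agent_args and the sample arguments are hypothetical and do not reflect how xos-createProcess is actually implemented.

```python
def split_agent_args(argv):
    """Split ssh-style agent arguments into (options, host, remote_command).

    Assumption: leading arguments that start with "-" are agent options,
    the first remaining argument is the target host, and everything after
    it is the command to start there. Options that take a separate value
    are not handled in this sketch.
    """
    rest = list(argv)
    options = []
    while rest and rest[0].startswith("-"):
        options.append(rest.pop(0))
    if not rest:
        raise ValueError("no host given")
    host, remote_command = rest[0], rest[1:]
    return options, host, remote_command


# Hypothetical invocation: one option, a host name, then the daemon command.
opts, host, command = split_agent_args(["-x", "node1", "orted", "arg1"])
print(host, command)
```

An agent built this way would start the remote command on the given host and, like ssh, would normally need to stay alive for as long as the remote process runs.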