Trolling through some really old messages that never got replies... :-(

The behavior that you are seeing is the result of a really long discussion among the OMPI developers back when we were writing the TCP device. The problem is that there is ambiguity when connecting peers across TCP in Open MPI. Specifically, since OMPI can span multiple TCP networks, each MPI process may be able to use multiple IP addresses to reach each other MPI process (and vice versa). So we have to try to figure out which IP addresses can speak to which others.
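In rough C, the check we end up doing for each pair of addresses amounts to something like the sketch below (a simplified illustration with made-up names, *not* the actual btl_tcp source -- the real rules are spelled out in prose further down):

/*
 * Hypothetical sketch (not the actual Open MPI btl_tcp code) of the
 * pairwise test: two peers' addresses are assumed mutually routable if
 * they land in the same subnet after masking, or if both are public;
 * otherwise the TCP connection is simply disallowed.
 */
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* RFC 1918 private ranges: 10/8, 172.16/12, 192.168/16 */
static bool is_private(uint32_t a)
{
    return ((a & 0xff000000u) == 0x0a000000u) ||
           ((a & 0xfff00000u) == 0xac100000u) ||
           ((a & 0xffff0000u) == 0xc0a80000u);
}

static bool mutually_routable(const char *ip_a, const char *ip_b,
                              const char *netmask)
{
    struct in_addr a, b, m;

    /* error checking omitted for brevity */
    inet_pton(AF_INET, ip_a, &a);
    inet_pton(AF_INET, ip_b, &b);
    inet_pton(AF_INET, netmask, &m);

    uint32_t ha = ntohl(a.s_addr);
    uint32_t hb = ntohl(b.s_addr);
    uint32_t hm = ntohl(m.s_addr);

    if ((ha & hm) == (hb & hm))                 /* rule 1: same subnet  */
        return true;
    if (!is_private(ha) && !is_private(hb))     /* rule 2: both public  */
        return true;
    return false;                               /* rule 3: disallow     */
}

int main(void)
{
    /* a 2-net host and a 3-net host with a /24 mask fall through to rule 3 */
    printf("%d\n", mutually_routable("192.168.2.3", "192.168.3.1",
                                     "255.255.255.0"));   /* prints 0 */
    /* two hosts on the same /24 pass rule 1 */
    printf("%d\n", mutually_routable("192.168.2.3", "192.168.2.7",
                                     "255.255.255.0"));   /* prints 1 */
    return 0;
}

With a 255.255.255.0 netmask, a 192.168.2.x address and a 192.168.3.x address fail the first test, are both private so they fail the second, and fall through to the last case -- which is exactly the situation described below.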
For example, say that you have a cluster with 16 nodes on a private ethernet network. One of these nodes doubles as the head node for the cluster and therefore has 2 ethernet NICs -- one to the external network and one to the internal cluster network. But since 16 is a nice number, you want to use the head node for computation as well. So when you mpirun spanning all 16 nodes, OMPI has to figure out that it should *not* use the external NIC on the head node and only use the internal NIC.

TCP connections are only made on demand, which is why you only see this behavior if two processes actually attempt to communicate via MPI (i.e., "hello world" with no sending/receiving works fine, but adding the MPI_SEND/MPI_RECV makes it fail). We make connections by having all MPI processes exchange their IP address(es) and port number(s) during MPI_INIT (via a common rendezvous point, typically mpirun). Then, whenever a connection is requested between two processes, we apply a small set of rules to all pair combinations of IP addresses of those processes:

1. If the two IP addresses match after the subnet mask is applied, assume that they are mutually routable and allow the connection.
2. If the two IP addresses are public, assume that they are mutually routable and allow the connection.
3. Otherwise, the connection is disallowed (this is not an error -- we just disallow this connection in the hope that some other device can be used to make the connection).

What is happening in your case is that you're falling through to #3 for all IP address pair combinations, and there is no other device that can reach these processes. Therefore OMPI thinks that it has no channel to reach the remote process, so it bails (in a horribly non-descriptive way :-( ).

We actually have a very long comment about this in the TCP code, noting that your scenario (lots of hosts in a single cluster with private addresses and relatively narrow subnet masks, even though all addresses are, in fact, routable to each other) is not currently supported -- and that we need to do something "better". "Better" in this case probably means having a configuration file that specifies which hosts are mutually routable when the above rules don't work. Do you have any suggestions on this front?


On 7/5/06 1:15 PM, "Frank Kahle" <openmpi-u...@fraka-mp.de> wrote:

> users-requ...@open-mpi.org wrote:
>> A few clarifying questions:
>>
>> What is your netmask on these hosts?
>>
>> Where is the MPI_ALLREDUCE in your app -- right away, or somewhere deep
>> within the application? Can you replicate this with a simple MPI
>> application that essentially calls MPI_INIT, MPI_ALLREDUCE, and
>> MPI_FINALIZE?
>>
>> Can you replicate this with a simple MPI app that does an MPI_SEND /
>> MPI_RECV between two processes on the different subnets?
>>
>> Thanks.
>>
>
> @ Jeff,
>
> netmask 255.255.255.0
>
> Running a simple "hello world" yields no error on each subnet, but
> running "hello world" on both subnets yields the error
>
> [g5dual.3-net:00436] *** An error occurred in MPI_Send
> [g5dual.3-net:00436] *** on communicator MPI_COMM_WORLD
> [g5dual.3-net:00436] *** MPI_ERR_INTERN: internal error
> [g5dual.3-net:00436] *** MPI_ERRORS_ARE_FATAL (goodbye)
>
> Hope this helps!
>
> Frank
>
>
> Just in case you wanna check the source:
> c     Fortran example hello_world
>       program hello
>       include 'mpif.h'
>       integer rank, size, ierror, tag, status(MPI_STATUS_SIZE)
>       character*12 message
>
>       call MPI_INIT(ierror)
>       call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
>       call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
>       tag = 100
>
>       if (rank .eq. 0) then
>          message = 'Hello, world'
>          do i=1, size-1
>             call MPI_SEND(message, 12, MPI_CHARACTER, i, tag,
>      &                    MPI_COMM_WORLD, ierror)
>          enddo
>
>       else
>          call MPI_RECV(message, 12, MPI_CHARACTER, 0, tag,
>      &                 MPI_COMM_WORLD, status, ierror)
>       endif
>
>       print*, 'node', rank, ':', message
>       call MPI_FINALIZE(ierror)
>       end
>
>
> or the full output:
>
> [powerbook:/Network/CFD/hello] motte% mpirun -d -np 5 --hostfile ./hostfile /Network/CFD/hello/hello_world
> [powerbook.2-net:00606] [0,0,0] setting up session dir with
> [powerbook.2-net:00606]   universe default-universe
> [powerbook.2-net:00606]   user motte
> [powerbook.2-net:00606]   host powerbook.2-net
> [powerbook.2-net:00606]   jobid 0
> [powerbook.2-net:00606]   procid 0
> [powerbook.2-net:00606] procdir: /tmp/openmpi-sessions-motte@powerbook.2-net_0/default-universe/0/0
> [powerbook.2-net:00606] jobdir: /tmp/openmpi-sessions-motte@powerbook.2-net_0/default-universe/0
> [powerbook.2-net:00606] unidir: /tmp/openmpi-sessions-motte@powerbook.2-net_0/default-universe
> [powerbook.2-net:00606] top: openmpi-sessions-motte@powerbook.2-net_0
> [powerbook.2-net:00606] tmp: /tmp
> [powerbook.2-net:00606] [0,0,0] contact_file /tmp/openmpi-sessions-motte@powerbook.2-net_0/default-universe/universe-setup.txt
> [powerbook.2-net:00606] [0,0,0] wrote setup file
> [powerbook.2-net:00606] pls:rsh: local csh: 1, local bash: 0
> [powerbook.2-net:00606] pls:rsh: assuming same remote shell as local shell
> [powerbook.2-net:00606] pls:rsh: remote csh: 1, remote bash: 0
> [powerbook.2-net:00606] pls:rsh: final template argv:
> [powerbook.2-net:00606] pls:rsh:     /usr/bin/ssh <template> orted --debug --bootproxy 1 --name <template> --num_procs 6 --vpid_start 0 --nodename <template> --universe motte@powerbook.2-net:default-universe --nsreplica "0.0.0;tcp://192.168.2.3:49443" --gprreplica "0.0.0;tcp://192.168.2.3:49443" --mpi-call-yield 0
> [powerbook.2-net:00606] pls:rsh: launching on node Powerbook.2-net
> [powerbook.2-net:00606] pls:rsh: not oversubscribed -- setting mpi_yield_when_idle to 0
> [powerbook.2-net:00606] pls:rsh: Powerbook.2-net is a LOCAL node
> [powerbook.2-net:00606] pls:rsh: changing to directory /Users/motte
> [powerbook.2-net:00606] pls:rsh: executing: orted --debug --bootproxy 1 --name 0.0.1 --num_procs 6 --vpid_start 0 --nodename Powerbook.2-net --universe motte@powerbook.2-net:default-universe --nsreplica "0.0.0;tcp://192.168.2.3:49443" --gprreplica "0.0.0;tcp://192.168.2.3:49443" --mpi-call-yield 0
> [powerbook.2-net:00607] [0,0,1] setting up session dir with
> [powerbook.2-net:00607]   universe default-universe
> [powerbook.2-net:00607]   user motte
> [powerbook.2-net:00607]   host Powerbook.2-net
> [powerbook.2-net:00607]   jobid 0
> [powerbook.2-net:00607]   procid 1
> [powerbook.2-net:00607] procdir: /tmp/openmpi-sessions-motte@Powerbook.2-net_0/default-universe/0/1
> [powerbook.2-net:00607] jobdir: /tmp/openmpi-sessions-motte@Powerbook.2-net_0/default-universe/0
> [powerbook.2-net:00607] unidir: /tmp/openmpi-sessions-motte@Powerbook.2-net_0/default-universe
> [powerbook.2-net:00607] top: openmpi-sessions-motte@Powerbook.2-net_0
> [powerbook.2-net:00607] tmp: /tmp
> [powerbook.2-net:00606] pls:rsh: launching on node g4d003.3-net
> [powerbook.2-net:00606] pls:rsh: not oversubscribed -- setting mpi_yield_when_idle to 0
> [powerbook.2-net:00606] pls:rsh: g4d003.3-net is a REMOTE node
> [powerbook.2-net:00606] pls:rsh: executing: /usr/bin/ssh g4d003.3-net orted --debug --bootproxy 1 --name 0.0.2 --num_procs 6 --vpid_start 0 --nodename g4d003.3-net --universe motte@powerbook.2-net:default-universe --nsreplica "0.0.0;tcp://192.168.2.3:49443" --gprreplica "0.0.0;tcp://192.168.2.3:49443" --mpi-call-yield 0
> [g4d003.3-net:00411] [0,0,2] setting up session dir with
> [g4d003.3-net:00411]   universe default-universe
> [g4d003.3-net:00411]   user motte
> [g4d003.3-net:00411]   host g4d003.3-net
> [g4d003.3-net:00411]   jobid 0
> [g4d003.3-net:00411]   procid 2
> [g4d003.3-net:00411] procdir: /tmp/openmpi-sessions-motte@g4d003.3-net_0/default-universe/0/2
> [g4d003.3-net:00411] jobdir: /tmp/openmpi-sessions-motte@g4d003.3-net_0/default-universe/0
> [g4d003.3-net:00411] unidir: /tmp/openmpi-sessions-motte@g4d003.3-net_0/default-universe
> [g4d003.3-net:00411] top: openmpi-sessions-motte@g4d003.3-net_0
> [g4d003.3-net:00411] tmp: /tmp
> [powerbook.2-net:00606] pls:rsh: launching on node g4d002.3-net
> [powerbook.2-net:00606] pls:rsh: not oversubscribed -- setting mpi_yield_when_idle to 0
> [powerbook.2-net:00606] pls:rsh: g4d002.3-net is a REMOTE node
> [powerbook.2-net:00606] pls:rsh: executing: /usr/bin/ssh g4d002.3-net orted --debug --bootproxy 1 --name 0.0.3 --num_procs 6 --vpid_start 0 --nodename g4d002.3-net --universe motte@powerbook.2-net:default-universe --nsreplica "0.0.0;tcp://192.168.2.3:49443" --gprreplica "0.0.0;tcp://192.168.2.3:49443" --mpi-call-yield 0
> [powerbook.2-net:00606] pls:rsh: launching on node g4d001.3-net
> [powerbook.2-net:00606] pls:rsh: not oversubscribed -- setting mpi_yield_when_idle to 0
> [powerbook.2-net:00606] pls:rsh: g4d001.3-net is a REMOTE node
> [powerbook.2-net:00606] pls:rsh: executing: /usr/bin/ssh g4d001.3-net orted --debug --bootproxy 1 --name 0.0.4 --num_procs 6 --vpid_start 0 --nodename g4d001.3-net --universe motte@powerbook.2-net:default-universe --nsreplica "0.0.0;tcp://192.168.2.3:49443" --gprreplica "0.0.0;tcp://192.168.2.3:49443" --mpi-call-yield 0
> [powerbook.2-net:00606] pls:rsh: launching on node G5Dual.3-net
> [powerbook.2-net:00606] pls:rsh: not oversubscribed -- setting mpi_yield_when_idle to 0
> [powerbook.2-net:00606] pls:rsh: G5Dual.3-net is a REMOTE node
> [powerbook.2-net:00606] pls:rsh: executing: /usr/bin/ssh G5Dual.3-net orted --debug --bootproxy 1 --name 0.0.5 --num_procs 6 --vpid_start 0 --nodename G5Dual.3-net --universe motte@powerbook.2-net:default-universe --nsreplica "0.0.0;tcp://192.168.2.3:49443" --gprreplica "0.0.0;tcp://192.168.2.3:49443" --mpi-call-yield 0
> [g4d001.3-net:00336] [0,0,4] setting up session dir with
> [g4d001.3-net:00336]   universe default-universe
> [g4d001.3-net:00336]   user motte
> [g4d001.3-net:00336]   host g4d001.3-net
> [g4d001.3-net:00336]   jobid 0
> [g4d001.3-net:00336]   procid 4
> [g4d001.3-net:00336] procdir: /tmp/openmpi-sessions-motte@g4d001.3-net_0/default-universe/0/4
> [g4d001.3-net:00336] jobdir: /tmp/openmpi-sessions-motte@g4d001.3-net_0/default-universe/0
> [g4d001.3-net:00336] unidir: /tmp/openmpi-sessions-motte@g4d001.3-net_0/default-universe
> [g4d001.3-net:00336] top: openmpi-sessions-motte@g4d001.3-net_0
> [g4d001.3-net:00336] tmp: /tmp
> [g4d002.3-net:00279] [0,0,3] setting up session dir with
> [g4d002.3-net:00279]   universe default-universe
> [g4d002.3-net:00279]   user motte
> [g4d002.3-net:00279]   host g4d002.3-net
> [g4d002.3-net:00279]   jobid 0
> [g4d002.3-net:00279]   procid 3
> [g4d002.3-net:00279] procdir: /tmp/openmpi-sessions-motte@g4d002.3-net_0/default-universe/0/3
> [g4d002.3-net:00279] jobdir: /tmp/openmpi-sessions-motte@g4d002.3-net_0/default-universe/0
> [g4d002.3-net:00279] unidir: /tmp/openmpi-sessions-motte@g4d002.3-net_0/default-universe
> [g4d002.3-net:00279] top: openmpi-sessions-motte@g4d002.3-net_0
> [g4d002.3-net:00279] tmp: /tmp
> [g5dual.3-net:00434] [0,0,5] setting up session dir with
> [g5dual.3-net:00434]   universe default-universe
> [g5dual.3-net:00434]   user motte
> [g5dual.3-net:00434]   host G5Dual.3-net
> [g5dual.3-net:00434]   jobid 0
> [g5dual.3-net:00434]   procid 5
> [g5dual.3-net:00434] procdir: /tmp/openmpi-sessions-motte@G5Dual.3-net_0/default-universe/0/5
> [g5dual.3-net:00434] jobdir: /tmp/openmpi-sessions-motte@G5Dual.3-net_0/default-universe/0
> [g5dual.3-net:00434] unidir: /tmp/openmpi-sessions-motte@G5Dual.3-net_0/default-universe
> [g5dual.3-net:00434] top: openmpi-sessions-motte@G5Dual.3-net_0
> [g5dual.3-net:00434] tmp: /tmp
> [powerbook.2-net:00613] [0,1,4] setting up session dir with
> [powerbook.2-net:00613]   universe default-universe
> [powerbook.2-net:00613]   user motte
> [powerbook.2-net:00613]   host Powerbook.2-net
> [powerbook.2-net:00613]   jobid 1
> [powerbook.2-net:00613]   procid 4
> [powerbook.2-net:00613] procdir: /tmp/openmpi-sessions-motte@Powerbook.2-net_0/default-universe/1/4
> [powerbook.2-net:00613] jobdir: /tmp/openmpi-sessions-motte@Powerbook.2-net_0/default-universe/1
> [powerbook.2-net:00613] unidir: /tmp/openmpi-sessions-motte@Powerbook.2-net_0/default-universe
> [powerbook.2-net:00613] top: openmpi-sessions-motte@Powerbook.2-net_0
> [powerbook.2-net:00613] tmp: /tmp
> [g5dual.3-net:00436] [0,1,0] setting up session dir with
> [g5dual.3-net:00436]   universe default-universe
> [g5dual.3-net:00436]   user motte
> [g5dual.3-net:00436]   host G5Dual.3-net
> [g5dual.3-net:00436]   jobid 1
> [g5dual.3-net:00436]   procid 0
> [g5dual.3-net:00436] procdir: /tmp/openmpi-sessions-motte@G5Dual.3-net_0/default-universe/1/0
> [g5dual.3-net:00436] jobdir: /tmp/openmpi-sessions-motte@G5Dual.3-net_0/default-universe/1
> [g5dual.3-net:00436] unidir: /tmp/openmpi-sessions-motte@G5Dual.3-net_0/default-universe
> [g5dual.3-net:00436] top: openmpi-sessions-motte@G5Dual.3-net_0
> [g5dual.3-net:00436] tmp: /tmp
> [g4d001.3-net:00338] [0,1,1] setting up session dir with
> [g4d001.3-net:00338]   universe default-universe
> [g4d001.3-net:00338]   user motte
> [g4d001.3-net:00338]   host g4d001.3-net
> [g4d001.3-net:00338]   jobid 1
> [g4d001.3-net:00338]   procid 1
> [g4d001.3-net:00338] procdir: /tmp/openmpi-sessions-motte@g4d001.3-net_0/default-universe/1/1
> [g4d001.3-net:00338] jobdir: /tmp/openmpi-sessions-motte@g4d001.3-net_0/default-universe/1
> [g4d001.3-net:00338] unidir: /tmp/openmpi-sessions-motte@g4d001.3-net_0/default-universe
> [g4d001.3-net:00338] top: openmpi-sessions-motte@g4d001.3-net_0
> [g4d001.3-net:00338] tmp: /tmp
> [g4d003.3-net:00413] [0,1,3] setting up session dir with
> [g4d003.3-net:00413]   universe default-universe
> [g4d003.3-net:00413]   user motte
> [g4d003.3-net:00413]   host g4d003.3-net
> [g4d003.3-net:00413]   jobid 1
> [g4d003.3-net:00413]   procid 3
> [g4d003.3-net:00413] procdir: /tmp/openmpi-sessions-motte@g4d003.3-net_0/default-universe/1/3
> [g4d003.3-net:00413] jobdir: /tmp/openmpi-sessions-motte@g4d003.3-net_0/default-universe/1
> [g4d003.3-net:00413] unidir: /tmp/openmpi-sessions-motte@g4d003.3-net_0/default-universe
> [g4d003.3-net:00413] top: openmpi-sessions-motte@g4d003.3-net_0
> [g4d003.3-net:00413] tmp: /tmp
> [g4d002.3-net:00281] [0,1,2] setting up session dir with
> [g4d002.3-net:00281]   universe default-universe
> [g4d002.3-net:00281]   user motte
> [g4d002.3-net:00281]   host g4d002.3-net
> [g4d002.3-net:00281]   jobid 1
> [g4d002.3-net:00281]   procid 2
> [g4d002.3-net:00281] procdir: /tmp/openmpi-sessions-motte@g4d002.3-net_0/default-universe/1/2
> [g4d002.3-net:00281] jobdir: /tmp/openmpi-sessions-motte@g4d002.3-net_0/default-universe/1
> [g4d002.3-net:00281] unidir: /tmp/openmpi-sessions-motte@g4d002.3-net_0/default-universe
> [g4d002.3-net:00281] top: openmpi-sessions-motte@g4d002.3-net_0
> [g4d002.3-net:00281] tmp: /tmp
> [powerbook.2-net:00606] spawn: in job_state_callback(jobid = 1, state = 0x4)
> [powerbook.2-net:00606] Info: Setting up debugger process table for applications
>   MPIR_being_debugged = 0
>   MPIR_debug_gate = 0
>   MPIR_debug_state = 1
>   MPIR_acquired_pre_main = 0
>   MPIR_i_am_starter = 0
>   MPIR_proctable_size = 5
>   MPIR_proctable:
>     (i, host, exe, pid) = (0, G5Dual.3-net, /Network/CFD/hello/hello_world, 436)
>     (i, host, exe, pid) = (1, g4d001.3-net, /Network/CFD/hello/hello_world, 338)
>     (i, host, exe, pid) = (2, g4d002.3-net, /Network/CFD/hello/hello_world, 281)
>     (i, host, exe, pid) = (3, g4d003.3-net, /Network/CFD/hello/hello_world, 413)
>     (i, host, exe, pid) = (4, Powerbook.2-net, /Network/CFD/hello/hello_world, 613)
> [powerbook.2-net:00613] [0,1,4] ompi_mpi_init completed
> [g4d001.3-net:00338] [0,1,1] ompi_mpi_init completed
> [g5dual.3-net:00436] [0,1,0] ompi_mpi_init completed
> [g4d003.3-net:00413] [0,1,3] ompi_mpi_init completed
> [g4d002.3-net:00281] [0,1,2] ompi_mpi_init completed
> node 1 :Hello, world
> node 2 :Hello, world node 3 :Hello, world
> [g5dual.3-net:00436] *** An error occurred in MPI_Send
>
> [g5dual.3-net:00436] *** on communicator MPI_COMM_WORLD
> [g5dual.3-net:00436] *** MPI_ERR_INTERN: internal error
> [g5dual.3-net:00436] *** MPI_ERRORS_ARE_FATAL (goodbye)
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: powerbook.2-net
> PID: 613
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d003.3-net
> PID: 413
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g5dual.3-net
> PID: 436
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d002.3-net
> PID: 281
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d001.3-net
> PID: 338
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> [g5dual.3-net:00434] sess_dir_finalize: found proc session dir empty - deleting
> [g5dual.3-net:00434] sess_dir_finalize: found job session dir empty - deleting
> [g5dual.3-net:00434] sess_dir_finalize: univ session dir not empty - leaving
> [powerbook.2-net:00607] orted: job_state_callback(jobid = 1, state = ORTE_PROC_STATE_ABORTED)
> [g5dual.3-net:00434] orted: job_state_callback(jobid = 1, state = ORTE_PROC_STATE_ABORTED)
> [g4d003.3-net:00411] orted: job_state_callback(jobid = 1, state = ORTE_PROC_STATE_ABORTED)
> [g4d001.3-net:00336] orted: job_state_callback(jobid = 1, state = ORTE_PROC_STATE_ABORTED)
> [g5dual.3-net:00434] sess_dir_finalize: job session dir not empty - leaving
> [g5dual.3-net:00434] sess_dir_finalize: found proc session dir empty - deleting
> [g5dual.3-net:00434] sess_dir_finalize: found job session dir empty - deleting
> [g5dual.3-net:00434] sess_dir_finalize: found univ session dir empty - deleting
> [g5dual.3-net:00434] sess_dir_finalize: found top session dir empty - deleting
> [g4d002.3-net:00279] orted: job_state_callback(jobid = 1, state = ORTE_PROC_STATE_ABORTED)
> [g4d002.3-net:00279] sess_dir_finalize: found job session dir empty - deleting
> [g4d002.3-net:00279] sess_dir_finalize: univ session dir not empty - leaving
> [g4d002.3-net:00279] sess_dir_finalize: proc session dir not empty - leaving
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d002.3-net
> PID: 281
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d002.3-net
> PID: 281
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> [g4d002.3-net:00279] sess_dir_finalize: found proc session dir empty - deleting
> [g4d002.3-net:00279] sess_dir_finalize: found job session dir empty - deleting
> [g4d002.3-net:00279] sess_dir_finalize: found univ session dir empty - deleting
> [g4d002.3-net:00279] sess_dir_finalize: found top session dir empty - deleting
> [powerbook.2-net:00607] sess_dir_finalize: found job session dir empty - deleting
> [powerbook.2-net:00607] sess_dir_finalize: univ session dir not empty - leaving
> [powerbook.2-net:00607] sess_dir_finalize: proc session dir not empty - leaving
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: powerbook.2-net
> PID: 613
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: powerbook.2-net
> PID: 613
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> [powerbook.2-net:00607] sess_dir_finalize: found proc session dir empty - deleting
> [powerbook.2-net:00607] sess_dir_finalize: job session dir not empty - leaving
> [g4d001.3-net:00336] sess_dir_finalize: found job session dir empty - deleting
> [g4d001.3-net:00336] sess_dir_finalize: univ session dir not empty - leaving
> [g4d001.3-net:00336] sess_dir_finalize: proc session dir not empty - leaving
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d001.3-net
> PID: 338
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d001.3-net
> PID: 338
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> [g4d001.3-net:00336] sess_dir_finalize: found proc session dir empty - deleting
> [g4d001.3-net:00336] sess_dir_finalize: found job session dir empty - deleting
> [g4d001.3-net:00336] sess_dir_finalize: found univ session dir empty - deleting
> [g4d001.3-net:00336] sess_dir_finalize: found top session dir empty - deleting
> [g4d003.3-net:00411] sess_dir_finalize: found job session dir empty - deleting
> [g4d003.3-net:00411] sess_dir_finalize: univ session dir not empty - leaving
> [g4d003.3-net:00411] sess_dir_finalize: proc session dir not empty - leaving
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d003.3-net
> PID: 413
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: g4d003.3-net
> PID: 413
>
> This process may still be running and/or consuming resources.
> --------------------------------------------------------------------------
> 1 process killed (possibly by Open MPI)
> [g4d003.3-net:00411] orted: job_state_callback(jobid = 1, state = ORTE_PROC_STATE_TERMINATED)
> [g4d003.3-net:00411] sess_dir_finalize: found proc session dir empty - deleting
> [g4d003.3-net:00411] sess_dir_finalize: found job session dir empty - deleting
> [g4d003.3-net:00411] sess_dir_finalize: found univ session dir empty - deleting
> [g4d003.3-net:00411] sess_dir_finalize: found top session dir empty - deleting
> [powerbook:/Network/CFD/hello] motte%
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems