Hello,

On Tue, Jun 22, 2010 at 8:05 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Sorry for the problem - the issue is a bug in the handling of the
> pernode option in 1.4.2. This has been fixed and awaits release in
> 1.4.3.
>

Thank you for pointing this out.  Unfortunately, I am still unable
to start remote processes::

  $ mpirun --host compute-0-11 -np 1 ./hello_mpi
  --------------------------------------------------------------------------
  mpirun noticed that the job aborted, but has no info as to the process
  that caused that situation.
  --------------------------------------------------------------------------

The same program runs fine if I use "--host localhost".

Running "strace -v" on the "mpirun" command shows a strange
invocation of "orted"::

 execve("//usr/bin/ssh", ["/usr/bin/ssh", "-x", "compute-0-11",
        " orted", "--daemonize", "-mca", "ess", "env",
        "-mca", "orte_ess_jobid", "2322006016", "-mca",
        "orte_ess_vpid", "1", "-mca", "orte_ess_num_procs", "2",
        "--hnp-uri", "\"2322006016.0;tcp://192.168.122.1"],
        ["MKLROOT=/opt/intel/mkl/10.0.3.02", ...])

The 192.168.122.1 address is assigned to an internal Xen bridge,
"virbr0", so it should not be advertised as the "call-back" address.
Is there a command-line option to force mpirun to use a certain IP address?
I have tried starting "mpirun" with "--mca btl_tcp_if_exclude lo,virbr0"
to no avail.

Also, the " orted" argument to ssh starts with a space; is this OK?

I'm using OMPI 1.4.2, self-compiled on a Rocks 5.2 (i.e., CentOS 5.2) cluster.

Regards,
Riccardo
