Siegmar,

Could you run
LD_LIBRARY_PATH= LD_LIBRARY_PATH64= /usr/bin/ssh
on all your boxes?

The root cause could be that you are trying to run ssh on box A with the
environment of box B.
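
A minimal sketch of that check, in case it helps to script it. The
helper name run_clean is made up for illustration, and the -V flag in
the usage comment is only an assumption about what the local ssh
accepts; any invocation that makes ssh load its libraries would do.

```shell
# Hypothetical helper: run a command with the library search path
# variables emptied, leaving the rest of the environment untouched.
run_clean() {
    env LD_LIBRARY_PATH= LD_LIBRARY_PATH64= "$@"
}

# Usage on each box (assumes the local ssh accepts -V):
#   /usr/bin/ssh -V             # with the current environment
#   run_clean /usr/bin/ssh -V   # with the library path variables emptied
```

If the relocation error shows up only in the first case, the library
path inherited from the environment is the likely culprit on that box.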

Could you also run mpiexec with the -tag-output option, so we can figure out
on which box ssh is failing?

Cheers,

Gilles

On Friday, May 15, 2015, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de>
wrote:

> Hi,
>
> I successfully installed openmpi-1.8.5 on my machines (Solaris 10
> Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with
> gcc-4.9.2 and Sun C 5.13. I get the same error with both compilers
> if I use the following command, and no errors if I swap the first
> two machines. I also get no errors with openmpi-dev-1708-g8497a6a,
> regardless of the order of the machines.
>
>
> tyr hello_1 109 which mpicc
> /usr/local/openmpi-1.8.5_64_cc/bin/mpicc
> tyr hello_1 110 mpiexec -np 5 -host sunpc1,linpc1,tyr,rs0 hello_1_mpi
> ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol
> SUNWcry_installed: referenced symbol not found
> --------------------------------------------------------------------------
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to use.
>
> *  compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
> --------------------------------------------------------------------------
>
>
>
> Now the program hangs and "top" shows that "orterun" is very busy.
>
>    PID USERNAME THR PR NCE  SIZE   RES STATE   TIME FLTS    CPU COMMAND
>  29550 fd1026     2  0   0 14.5M 8576K cpu01   1:06    0 47.72% orterun
>
>
>
>
> tyr hello_1 116 mpiexec -np 5 -host linpc1,sunpc1,tyr,rs0 hello_1_mpi
> Process 2 of 5 running on sunpc1
> Process 4 of 5 running on rs0.informatik.hs-fulda.de
> Process 3 of 5 running on tyr.informatik.hs-fulda.de
> Process 1 of 5 running on linpc1
> Process 0 of 5 running on linpc1
> ...
>
>
>
> Everything works fine with openmpi-dev-1708-g8497a6a.
>
> tyr hello_1 120 which mpicc
> /usr/local/openmpi-1.9.0_64_gcc/bin/mpicc
> tyr hello_1 121 mpiexec -np 5 -host sunpc1,linpc1,tyr,rs0 hello_1_mpi
> Process 2 of 5 running on linpc1
> Process 0 of 5 running on sunpc1
> Process 1 of 5 running on sunpc1
> Process 4 of 5 running on rs0.informatik.hs-fulda.de
> Process 3 of 5 running on tyr.informatik.hs-fulda.de
> ...
>
>
> Any ideas what's going wrong? I would be grateful if somebody could
> fix the problem. Thank you very much in advance for any help.
>
>
> Kind regards
>
> Siegmar
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/05/26871.php
>
