Sorry, I'll try to fill in the background. I'm attempting to package
Open MPI for a number of our customers; wherever possible on our
clusters we use modules to give users a choice of MPI environment.
I'm using the 1.2.6 stable release and have built the code twice, once
to /opt/openmpi-1.2.6/gnu and once to /opt/openmpi-1.2.6/intel. I have
created two module environments called openmpi-gnu and openmpi-intel,
and am also using an existing one called intel-compiler. The build was
successful in both cases.
If I load the openmpi-gnu module I can compile and run code using
mpicc/mpirun as expected. If I load openmpi-intel and intel-compiler I
find I can compile code, but I get an error about a missing libimf.so
when I try to run it (reproduced below).
The application *will* run if I add the line "module load
intel-compiler" to my .bashrc, as this allows orted to link. What I
think I want to do is compile the actual library with icc but compile
orted with gcc, so that I don't need to load the Intel environment by
default. I'm assuming the link problems exist only with orted and not
with the actual application, as LD_LIBRARY_PATH is set correctly in
the shell which launches the program.
Ashley Pittman.
sccomp@demo4-sles-10-1-fe:~/benchmarks/IMB_3.0/src> mpirun -H comp00,comp01 ./IMB-MPI1
/opt/openmpi-1.2.6/intel/bin/orted: error while loading shared
libraries: libimf.so: cannot open shared object file: No such file
or directory
/opt/openmpi-1.2.6/intel/bin/orted: error while loading shared
libraries: libimf.so: cannot open shared object file: No such file
or directory
[demo4-sles-10-1-fe:29303] ERROR: A daemon on node comp01 failed to
start as expected.
[demo4-sles-10-1-fe:29303] ERROR: There may be more information
available from
[demo4-sles-10-1-fe:29303] ERROR: the remote shell (see above).
[demo4-sles-10-1-fe:29303] ERROR: The daemon exited unexpectedly
with status 127.
[demo4-sles-10-1-fe:29303] [0,0,0] ORTE_ERROR_LOG: Timeout in file
base/pls_base_orted_cmds.c at line 275
[demo4-sles-10-1-fe:29303] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1166
[demo4-sles-10-1-fe:29303] [0,0,0] ORTE_ERROR_LOG: Timeout in file
errmgr_hnp.c at line 90
[demo4-sles-10-1-fe:29303] ERROR: A daemon on node comp00 failed to
start as expected.
[demo4-sles-10-1-fe:29303] ERROR: There may be more information
available from
[demo4-sles-10-1-fe:29303] ERROR: the remote shell (see above).
[demo4-sles-10-1-fe:29303] ERROR: The daemon exited unexpectedly
with status 127.
[demo4-sles-10-1-fe:29303] [0,0,0] ORTE_ERROR_LOG: Timeout in file
base/pls_base_orted_cmds.c at line 188
[demo4-sles-10-1-fe:29303] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1198
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
$ ldd /opt/openmpi-1.2.6/intel/bin/orted
    linux-vdso.so.1 => (0x00007fff877fe000)
    libopen-rte.so.0 => /opt/openmpi-1.2.6/intel/lib/libopen-rte.so.0 (0x00007fe97f3ac000)
    libopen-pal.so.0 => /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0 (0x00007fe97f239000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007fe97f135000)
    libnsl.so.1 => /lib64/libnsl.so.1 (0x00007fe97f01f000)
    libutil.so.1 => /lib64/libutil.so.1 (0x00007fe97ef1c000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fe97edc7000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe97ecba000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fe97eba3000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fe97e972000)
    libimf.so => /opt/intel/compiler_10.1/x86_64/lib/libimf.so (0x00007fe97e610000)
    libsvml.so => /opt/intel/compiler_10.1/x86_64/lib/libsvml.so (0x00007fe97e489000)
    libintlc.so.5 => /opt/intel/compiler_10.1/x86_64/lib/libintlc.so.5 (0x00007fe97e350000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe97f525000)
$ ssh comp00 ldd /opt/openmpi-1.2.6/intel/bin/orted
    libopen-rte.so.0 => /opt/openmpi-1.2.6/intel/lib/libopen-rte.so.0 (0x00002b1f0c0c5000)
    libopen-pal.so.0 => /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0 (0x00002b1f0c23e000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00002b1f0c3bc000)
    libnsl.so.1 => /lib64/libnsl.so.1 (0x00002b1f0c4c0000)
    libutil.so.1 => /lib64/libutil.so.1 (0x00002b1f0c5d7000)
    libm.so.6 => /lib64/libm.so.6 (0x00002b1f0c6da000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002b1f0c82f000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b1f0c93d000)
    libc.so.6 => /lib64/libc.so.6 (0x00002b1f0ca54000)
    /lib64/ld-linux-x86-64.so.2 (0x00002b1f0bfa9000)
    libimf.so => not found
    libsvml.so => not found
    libintlc.so.5 => not found
    libimf.so => not found
    libsvml.so => not found
    libintlc.so.5 => not found
$ ldd ./IMB-MPI1
    linux-vdso.so.1 => (0x00007fff2cbfe000)
    libmpi.so.0 => /opt/openmpi-1.2.6/intel/lib/libmpi.so.0 (0x00007f1624821000)
    libopen-rte.so.0 => /opt/openmpi-1.2.6/intel/lib/libopen-rte.so.0 (0x00007f16246a8000)
    libopen-pal.so.0 => /opt/openmpi-1.2.6/intel/lib/libopen-pal.so.0 (0x00007f1624535000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f1624431000)
    libnsl.so.1 => /lib64/libnsl.so.1 (0x00007f162431b000)
    libutil.so.1 => /lib64/libutil.so.1 (0x00007f1624218000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f16240c3000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f1623fb6000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1623e9f000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f1623c6e000)
    libimf.so => /opt/intel/compiler_10.1/x86_64/lib/libimf.so (0x00007f162390c000)
    libsvml.so => /opt/intel/compiler_10.1/x86_64/lib/libsvml.so (0x00007f1623785000)
    libintlc.so.5 => /opt/intel/compiler_10.1/x86_64/lib/libintlc.so.5 (0x00007f162364c000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f16249e0000)
On Mon, 2008-06-09 at 13:02 -0700, Doug Reeder wrote:
Ashley,
I am confused. In your first post you said orted fails with link
errors when you try to launch a job. From this I inferred that the
build and install steps for creating Open MPI were successful. Was the
build/install step successful? If so, what dynamic libraries does ldd
say orted is using?
Doug Reeder
On Jun 9, 2008, at 12:54 PM, Ashley Pittman wrote:
Putting to one side any religious views I might have about static
linking, how would that help in this case? It appears to be orted
itself that fails to link; I'm assuming the application would actually
run, either because LD_LIBRARY_PATH is set correctly on the front end
or because of the --prefix option to mpirun.
Or do you mean static linking of the tools? I could go for that if
there is a configure option for it.
Ashley Pittman.
On Mon, 2008-06-09 at 08:27 -0700, Doug Reeder wrote:
Ashley,
It could work, but I think you would be better off trying to
statically link the Intel libraries.
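A sketch of how that might be done: I believe icc 10.x accepts
-static-intel for pulling libimf/libsvml/libintlc in statically
(older releases used -i-static), though I haven't verified it against
this exact setup.

```shell
# Rebuild Open MPI asking icc to link the Intel runtime libraries
# statically into the produced binaries and shared libraries, so
# orted no longer needs libimf.so at run time:
./configure --prefix=/opt/openmpi-1.2.6/intel CC=icc CXX=icpc \
    LDFLAGS="-static-intel"
make all install
```

After this, `ldd orted` on the compute nodes should no longer list
the Intel runtime libraries at all.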
Doug Reeder
On Jun 9, 2008, at 4:34 AM, Ashley Pittman wrote:
Is there a way to use a different compiler for the ORTE component and
the shared library component when using Open MPI? We are finding that
if we use icc to compile Open MPI then orted fails with link errors
when I try to launch a job, as the Intel environment isn't loaded by
default.
We use the module command heavily and have modules for openmpi-gnu
and openmpi-intel as well as an intel_compiler module. To use
openmpi-intel we have to load intel_compiler by default on the compute
nodes, which isn't ideal. Is it possible to compile the ORTE component
with gcc and the library component with icc?
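For context, the two install trees were configured along these lines
(a sketch; the exact flags are assumed, not copied from the build
logs):

```shell
# GCC build, installed to the gnu tree:
./configure --prefix=/opt/openmpi-1.2.6/gnu CC=gcc CXX=g++ \
    F77=gfortran FC=gfortran
make all install

# Intel build -- the orted from this tree is the one that needs
# libimf.so and friends at run time:
./configure --prefix=/opt/openmpi-1.2.6/intel CC=icc CXX=icpc \
    F77=ifort FC=ifort
make all install
```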
Yours,
Ashley Pittman,
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users