If you want orted to find libimf.so, which is a shared Intel library,
export the library path with -x on mpirun:

mpirun .... -x LD_LIBRARY_PATH ....
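
For example (deb64 is the host named later in this thread; substitute
your own hosts):

mpirun -x LD_LIBRARY_PATH -host deb64 -np 4 connectivity_c

-x exports the named variable from your local environment to the
launched processes. Note that orted itself is started over ssh before
-x takes effect, so the remote shell startup files may still need the
Intel library paths, as discussed further down.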

DM


On Fri, 10 Apr 2009, Francesco Pietra wrote:

Hi Gus:

If you feel that the observations below are not relevant to openmpi,
please disregard the message. You have already kindly devoted so much
time to my problems.

The "limits.h" issue is solved with 10.1.022 intel compilers: as I
felt, the problem was with the pre-10.1.021 version of the intel C++
and ifort compilers, a subtle bug observed also by gentoo people (web
intel). There remains an orted issue.

The OpenMPI 1.3.1 installation compiled connectivity_c.c and
hello_c.c, but running them with mpirun failed (output below between ===):

=================
/usr/local/bin/mpirun -host -n 4 connectivity_c 2>&1 | tee connectivity.out
/usr/local/bin/orted: error while loading shared libraries: libimf.so:
cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
A daemon (pid 8472) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished
=============

At this point, Amber10 compiled nicely in serial (all Intel, like
OpenMPI), but the parallel test run, as expected, hit the same
problem as above:

=================
export TESTsander=/usr/local/amber10/exe/sander.MPI; make test.sander.BASIC
make[1]: Entering directory `/usr/local/amber10/test'
cd cytosine && ./Run.cytosine
orted: error while loading shared libraries: libimf.so: cannot open
shared object file: No such file or directory
--------------------------------------------------------------------------
A daemon (pid 8371) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

 ./Run.cytosine:  Program error
make[1]: *** [test.sander.BASIC] Error 1
make[1]: Leaving directory `/usr/local/amber10/test'
make: *** [test.sander.BASIC.MPI] Error 2
=====================

Relevant info:

The failing daemon was not ssh (so my hypothesis that a firewall on the
router was killing ssh does not hold). During these procedures, only
deb64 and deb32 were on the local network. The single-processor deb32
(i386) machine has nothing of OpenMPI or Amber on it, only ssh. Thus my
.bashrc on deb32 cannot match the one on deb64 as far as libraries are
concerned.

echo $LD_LIBRARY_PATH
/opt/intel/mkl/10.1.2.024/lib/em64t:/opt/intel/cce/10.1.022/lib:/opt/intel/fce/10.1.022/lib:/usr/local/lib

# dpkg --search libimf.so
intel-iforte101022: /opt/intel/fce/10.1.022/lib/libimf.so
intel-icce101022: /opt/intel/cce/10.1.022/lib/libimf.so

i.e., libimf.so is on LD_LIBRARY_PATH, yet it is still not found by orted.

Before compiling I tried to carefully check all environment variables
and paths. In particular, as to MPI:

mpif90 -show
/opt/intel/fce/10.1.022//bin/ifort -I/usr/local/include -pthread
-I/usr/local/lib -L/usr/local/lib -lmpi_f90 -lmpi_f77 -lmpi
-lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil
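
To narrow it down, two quick checks might help (a sketch, using the
paths reported above):

ldd /usr/local/bin/orted | grep imf
ssh deb64 'echo $LD_LIBRARY_PATH'

The first shows whether the dynamic linker can resolve libimf.so for
orted in the current shell; the second shows what a non-interactive
shell on the node sees, which is the environment a remotely launched
orted inherits.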

thanks
francesco



On Thu, Apr 9, 2009 at 9:29 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
Hi Francesco

Francesco Pietra wrote:

Hi:
As the failure to find "limits.h" in my attempted compilations of
Amber over the past few days (amd64 lenny, OpenMPI 1.3.1, Intel
compilers 10.1.015) is probably (or so I hope) a bug in that version
of the Intel compilers (with Debian I made the same observations
reported for Gentoo,
http://software.intel.com/en-us/forums/intel-c-compiler/topic/59886/),
I made a deb package of the 10.1.022 icc and ifort.

./configure CC icc, CXX icp,

The Intel C++ compiler is called icpc, not icp.
Is this a typo in your message, or in the actual configure options?

F77 and FC ifort --with-libnuma=/usr (not

/usr/lib so that the numa.h issue is not raised), "make clean",

If you really did "make clean" you may have removed useful things.
What did you do, "make" or "make clean"?

and

"mak install" gave no error signals. However, the compilation tests in
the examples did not pass and I really don't understand why.


Which compilation tests are you talking about?
The ones from Amber, or the OpenMPI example programs (connectivity_c and
hello_c), or both?

The error, with both connectivity_c and hello_c (I was operating
directly on the parallel computer deb64 and have access to everything
there), was the failure to find a shared library, libimf.so.


To get the right Intel environment,
you need to put these commands inside your login files
(.bashrc or .cshrc), to set up the Intel environment variables correctly:

source /path/to/your/intel/cce/bin/iccvars.sh
source /path/to/your/intel/cce/bin/ifortvars.sh

and perhaps a similar one for mkl.
(I don't use MKL, I don't know much about it).

If your home directory is NFS mounted to all the computers you
use to run parallel programs,
then the same .bashrc/.cshrc will work on all computers.
However, if you have a separate home directory on each computer,
then you need to do this on each of them.
I.e., you have to include the "source" commands above
in the .bashrc/.cshrc files in your home directory on EACH computer.

Also I presume you use bash/sh not tcsh/csh, right?
Otherwise you need to source iccvars.csh instead of iccvars.sh.
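
For example, in bash, something like this in the ~/.bashrc on each
node (the prefixes below are just the ones from your dpkg output;
adjust to your actual install):

source /opt/intel/cce/10.1.022/bin/iccvars.sh
source /opt/intel/fce/10.1.022/bin/ifortvars.sh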


# dpkg --search libimf.so
/opt/intel/fce/10.1.022/lib/libimf.so   (the same for cce)

This path seems to be correctly set by iccvars.sh and
ifortvars.sh (incidentally, both files are -rw-r--r-- root root;
is it correct that they are not executable?)


The permissions here are not a problem.
You are supposed to *source* the files, not to execute them.
If you execute them instead of sourcing the files,
your Intel environment will *NOT* be set up.

BTW, the easy way to check your environment is to type "env" at the
shell prompt.
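
For example:

env | grep -E 'intel|LD_LIBRARY_PATH'

should show the Intel bin directories in PATH and the Intel lib
directories in LD_LIBRARY_PATH.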

echo $LD_LIBRARY_PATH
returned, inter alia,

/opt/intel/mkl/10.1.2.024/lib/em64t:/opt/intel/mkl/10.1.2.024/lib/em64t:/opt/intel/cce/10.1.022/lib:/opt/intel/fce/10.1.022/lib
(why is the mkl path there twice?)


Hard to tell which computer you were on when you did this,
and hence what it affects.

You may have sourced the MKL setup script (the one that sets up the
MKL environment variables) twice, which would write its library path
more than once.

When the environment variables get this confused,
with duplicate paths and so on, you may want to log out
and log in again, to start fresh.

Do you need MKL for Amber?
If you don't use it, keep things simple, and don't bother about it.


I am surely missing something fundamental. I hope other eyes see
better.


Jody helped you run the hello_c program successfully.
Try to repeat the same steps carefully.
You should get the same result:
the OpenMPI test programs should run.

A kind person elsewhere suggested in passing: "The use of -rpath
during linking is highly recommended as opposed to setting
LD_LIBRARY_PATH at run time, not least because it hardcodes the
paths to the 'right' library files in the executables themselves."
Should this be relevant to the present issue, where can I learn about
-rpath linking?


If you are talking about Amber,
you would have to tweak its Makefiles to add the linker -rpath flags.
And since we don't know much about Amber's Makefiles,
this may be a very tricky approach.

If you are talking about the OpenMPI test programs,
I think it is just a matter of setting the Intel environment variables
right, sourcing iccvars.sh and ifortvars.sh properly,
to get the right runtime LD_LIBRARY_PATH.
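
For reference, an -rpath link of one of the test programs would look
something like this (a sketch; the library directory is the one from
your dpkg output):

mpicc connectivity_c.c -o connectivity_c -Wl,-rpath,/opt/intel/fce/10.1.022/lib

-Wl,-rpath,DIR hardcodes DIR into the executable's runtime search
path, so that binary no longer depends on LD_LIBRARY_PATH for that
directory. Note, though, that it would not help orted, which is the
binary failing in your logs.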

thanks
francesco pietra

I hope this helps.
Gus Correa

---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

