I remember having a conversation with someone from R at Supercomputing last year, and this was one of the issues we discussed. The problem is that you have to ensure that R is built against the OMPI you are going to use, and it is usually better to configure OMPI with --disable-dlopen --enable-static to avoid library confusion when you later run R.
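Roughly, the build would look like the sketch below. The install prefix, the Rmpi tarball name, and the RLIB path are placeholders, and the Rmpi configure arguments are from memory, so please check them against the Rmpi installation notes before relying on them:

```shell
# Placeholder install prefix; the thread below uses /home/ross/install.
PREFIX="$HOME/install"

# Configure Open MPI so its components are built into the main libraries
# instead of being dlopen()ed at run time, and also build static libraries.
./configure --prefix="$PREFIX" --disable-dlopen --enable-static
make all install

# Put the new installation first in both search paths.
export PATH="$PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PREFIX/lib:$LD_LIBRARY_PATH"

# Rebuild Rmpi against that installation.
# Assumption: these configure arguments match your Rmpi version.
R CMD INSTALL --configure-args="--with-Rmpi-type=OPENMPI --with-mpi=$PREFIX" Rmpi_x.y-z.tar.gz

# See which MPI libraries the resulting Rmpi.so resolves to.
# RLIB is a placeholder for your R library directory.
RLIB="$HOME/Rlib"
ldd "$RLIB/Rmpi/libs/Rmpi.so" | grep -E 'libmpi|libopen-(pal|rte)'
```

If the ldd output still points at /usr/lib/openmpi, the Rmpi build picked up the system copy rather than the one under the prefix.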
I'd give that a try and see if it solves your problems. The "recipe" given by Bennet looked right to me.

On Mar 12, 2014, at 12:32 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:

> On Wed, 2014-03-12 at 11:50 +0100, Reuti wrote:
>> On 12.03.2014 at 11:39, Jeff Squyres (jsquyres) wrote:
>>
>>> Generally, all you need to ensure that your personal copy of OMPI is used
>>> is to set the PATH and LD_LIBRARY_PATH to point to your new Open MPI
>>> installation. I do this all the time on my development cluster (where I
>>> have something like 6 billion different installations of OMPI available...
>>> mmm... should probably clean that up...)
>>>
>>> export LD_LIBRARY_PATH=path_to_my_ompi/lib:$LD_LIBRARY_PATH
>>> export PATH=path_to_my_ompi/bin:$PATH
>
> I believe I've already done that. The script that launches everything is
> (all one line originally)
> R_PROFILE_USER=~/KHC/sunbelt/Rmpiprofile \
> LD_LIBRARY_PATH=/home/ross/install/lib:$LD_LIBRARY_PATH \
> PATH=/home/ross/install/bin:$PATH orterun -x R_PROFILE_USER -x LD_LIBRARY_PATH -x PATH -hostfile ~/KHC/sunbelt/hosts \
> -np 7 R --no-save -q
>
> There is a complication with R; it sticks stuff in front of
> LD_LIBRARY_PATH. The startup script Rmpiprofile is meant to fix that,
> though I'm not entirely sure it is effective. However, the old
> libraries that are being loaded are not from any directories R added to
> LD_LIBRARY_PATH; instead they are from /usr/lib, which is a standard
> place for the dynamic loader to look.
>
>>> It should be noted that:
>>>
>>> 1. you need to *prefix* your PATH and LD_LIBRARY_PATH with these values
>>> 2. you need to set these values in a way that will be picked up on all
>>> servers that you use in your job. The safest way to do this is in your
>>> shell startup files (e.g., $HOME/.bashrc or whatever is relevant for your
>>> shell).
>>
>> I see "libtorque" in the output below - were these jobs running inside a
>> queuing system?
>> The set paths might be different therein, and need to be set
>> in the job script in this case.
>>
> No batch system (see script above for launch mechanism). We threw a lot
> of stuff MPI configure was looking for onto the system. AFAIK torque
> isn't even installed.
>
> One possible issue is that the Rmpi module for R is not compiled by
> mpicc; R has its own notions of proper options for the compiler and its
> own infrastructure for building things. I did pass the location of my
> local libraries into the build process.
>
> This seems more like an issue with the dynamic loader, or with whatever
> system R is using when it loads Rmpi.so.
>
> Ross
>> -- Reuti
>>
>>> See http://www.open-mpi.org/faq/?category=running#run-prereqs,
>>> http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path, and
>>> http://www.open-mpi.org/faq/?category=running#mpirun-prefix.
>>>
>>> Note the --prefix option that is described in the 3rd FAQ item I cited --
>>> that can be a bit easier, too.
>>>
>>> On Mar 12, 2014, at 2:51 AM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
>>>
>>>> I took the advice here and built a personal copy of the current openmpi,
>>>> to see if the problems I was having with Rmpi were a result of the old
>>>> version on the system.
>>>>
>>>> When I do ldd on the relevant libraries (Rmpi.so is loaded dynamically
>>>> by R) everything looks fine; path references that should be local are.
>>>> But when I run the program and do lsof it shows that both the system and
>>>> personal versions of key libraries are opened.
>>>>
>>>> First, does anyone know which library will actually be used, or how to
>>>> tell which library is actually used, in this situation? I'm running on
>>>> Linux (Debian squeeze).
>>>>
>>>> Second, is there some way to prevent the wrong/old/system libraries from
>>>> being loaded?
>>>>
>>>> FWIW I'm still seeing the old misbehavior when I run this way, but, as I
>>>> said, I'm really not sure which libraries are being used.
>>>> Since Rmpi was built against the new/local ones, I think the fact that
>>>> it doesn't crash means I really am using the new ones.
>>>>
>>>> Here are highlights of lsof on the process running R:
>>>> COMMAND   PID USER  FD TYPE DEVICE SIZE/OFF      NODE NAME
>>>> R       17634 ross cwd  DIR  254,2    12288 150773764 /home/ross/KHC/sunbelt
>>>> R       17634 ross rtd  DIR    8,1     4096         2 /
>>>> R       17634 ross txt  REG    8,1     5648   3058294 /usr/lib/R/bin/exec/R
>>>> R       17634 ross DEL  REG    8,1            2416718 /tmp/openmpi-sessions-ross@n100_0/60429/1/shared_mem_pool.n100
>>>> R       17634 ross mem  REG    8,1   335240   3105336 /usr/lib/openmpi/lib/libopen-pal.so.0.0.0
>>>> R       17634 ross mem  REG    8,1   304576   3105337 /usr/lib/openmpi/lib/libopen-rte.so.0.0.0
>>>> R       17634 ross mem  REG    8,1   679992   3105332 /usr/lib/openmpi/lib/libmpi.so.0.0.2
>>>> R       17634 ross mem  REG    8,1    93936   2967826 /usr/lib/libz.so.1.2.3.4
>>>> R       17634 ross mem  REG    8,1    10648   3187256 /lib/libutil-2.11.3.so
>>>> R       17634 ross mem  REG    8,1    32320   2359631 /usr/lib/libpciaccess.so.0.10.8
>>>> R       17634 ross mem  REG    8,1    33368   2359338 /usr/lib/libnuma.so.1
>>>> R       17634 ross mem  REG  254,2   979113 152045740 /home/ross/install/lib/libopen-pal.so.6.1.0
>>>> R       17634 ross mem  REG    8,1   183456   2359592 /usr/lib/libtorque.so.2.0.0
>>>> R       17634 ross mem  REG  254,2  1058125 152045781 /home/ross/install/lib/libopen-rte.so.7.0.0
>>>> R       17634 ross mem  REG    8,1    49936   2359341 /usr/lib/libibverbs.so.1.0.0
>>>> R       17634 ross mem  REG  254,2  2802579 152045867 /home/ross/install/lib/libmpi.so.1.3.0
>>>> R       17634 ross mem  REG  254,2   106626 152046481 /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so
>>>>
>>>> So libmpi, libopen-pal, and libopen-rte all are opened in two versions and
>>>> two locations.
>>>>
>>>> Thanks.
>>>> Ross Boylan
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
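
On the question of how to tell which copies a process has actually mapped: on Linux, one option is to take the paths from /proc/<pid>/maps (or the NAME column of lsof) and flag any library name that appears under more than one path. The sketch below does that; the function name dup_libs is made up for illustration, and it only reports what is mapped, not which copy a given symbol ends up resolving to.

```shell
# dup_libs: read file paths on stdin (one per line) and print the name of
# every shared library that appears under more than one distinct path.
# Hypothetical helper written for this thread, not part of any package.
dup_libs() {
    grep -E '\.so(\.[0-9]+)*$' |
    sort -u |
    awk '{
        n = split($0, parts, "/")
        base = parts[n]
        sub(/\.so.*$/, "", base)   # libmpi.so.1.3.0 becomes libmpi
        count[base]++
    }
    END {
        for (b in count)
            if (count[b] > 1)
                print b
    }' |
    sort
}

# Usage on Linux, with the PID from the lsof listing in this thread:
#   awk '{print $6}' /proc/17634/maps | dup_libs
# or, from lsof output:
#   lsof -p 17634 | awk '{print $NF}' | dup_libs
```

For the listing quoted above, this should name libmpi, libopen-pal, and libopen-rte, since each is mapped from both /usr/lib/openmpi/lib and /home/ross/install/lib.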