On Apr 26, 2010, at 9:08 AM, Matthew MacManes wrote:
>>
>>>         I am using SGE to submit jobs to one of the TeraGrid sites,
>>>         specifically TACC-RANGER. The problem is that I am using a
>>>         program that requires OpenMPI version 1.4.1, and the latest
>>>         install on RANGER is 1.3.1. I was told that I could install
>>>         OpenMPI in my home directory, and run jobs using my newer
>>>         version. However, I am having problems doing this, getting
>>>         the error message seen below.
>>>
>>>         It seems that the compute nodes are not finding all the
>>>         libraries required by the newer version of OpenMPI.
>>>
>>>         Can anybody tell me what I can do to get the jobs running
>>>         using the newer version of OpenMPI? Thanks!
>>>
>>>         TACC: Setting memory limits for job 1349843 to 3984588 KB
>>>         TACC: Dumping job script:
>>>         --------------------------------------------------------------------------------
>>>         #!/bin/bash
>>>         export TMPDIR=$SCRATCH/abyss_tmp/
>>>         LD_LIBRARY_PATH=/work/01301/mmacmane
>>>         LD_LIBRARY_PATH=/work/01301/mmacmane/bin
>>>         LD_LIBRARY_PATH=/work/01301/mmacmane/include
>>>         LD_LIBRARY_PATH=/work/01301/mmacmane/etc
>>>         LD_LIBRARY_PATH=/work/01301/mmacmane/lib
>>>         LD_LIBRARY_PATH=/work/01301/mmacmane/openmpi-1.4.1
>>>         cd /work/01301/mmacmane/Ray-0.0.6
>>>         module load openmpi
>>>         #$ -N testing_MRNA2
>>>         #$ -j y
>>>         #$ -o /work/01301/mmacmane/Ray-0.0.6/testing_MRNA2
>>>         #$ -pe 8way 128
>>>         #$ -q normal   
>>>         #$ -l h_rt=2:00:00   
>>>         #$ -M macma...@gmail.com
>>>         #$ -m be
>>>         #$ -cwd
>>>         #$ -V
>>>         /work/01301/mmacmane/bin/mpirun Ray /work/01301/mmacmane/Ray-0.0.6/Ray_snp.txt
>>>         --------------------------------------------------------------------------------
>>>         TACC: Done.
>>>             Module mvapich superceded

Your job script is incorrect. Specifically, you define LD_LIBRARY_PATH
six different times, with each definition overwriting the previous one:

LD_LIBRARY_PATH=/work/01301/mmacmane
LD_LIBRARY_PATH=/work/01301/mmacmane/bin
LD_LIBRARY_PATH=/work/01301/mmacmane/include
LD_LIBRARY_PATH=/work/01301/mmacmane/etc
LD_LIBRARY_PATH=/work/01301/mmacmane/lib
LD_LIBRARY_PATH=/work/01301/mmacmane/openmpi-1.4.1

After these lines, your LD_LIBRARY_PATH is set to

/work/01301/mmacmane/openmpi-1.4.1

That directory is also pointless to have in your LD_LIBRARY_PATH, since
the directory itself won't contain any library files.
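
If you want to see the overwriting for yourself, here is a minimal
sketch you can paste into a bash shell. It uses just two of the paths
from your script; nothing needs to exist at those paths for the
demonstration to work.

```shell
# Each plain assignment replaces the previous value; nothing accumulates.
LD_LIBRARY_PATH=/work/01301/mmacmane/lib
LD_LIBRARY_PATH=/work/01301/mmacmane/openmpi-1.4.1

# Only the last assignment survives.
echo "$LD_LIBRARY_PATH"
```

The echo prints only /work/01301/mmacmane/openmpi-1.4.1, which is why
the lib directory never reaches the dynamic linker.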

The correct syntax to define your LD_LIBRARY_PATH with the above
directories would be this:

LD_LIBRARY_PATH=/work/01301/mmacmane:/work/01301/mmacmane/bin:/work/01301/mmacmane/include:/work/01301/mmacmane/etc:/work/01301/mmacmane/lib:/work/01301/mmacmane/openmpi-1.4.1

But that can be simplified significantly: only one of these directories
actually contains library files, /work/01301/mmacmane/lib, so you should
only need this:

LD_LIBRARY_PATH=/work/01301/mmacmane/lib
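
If it turns out that something else in your environment already relies
on LD_LIBRARY_PATH (I am assuming that may be the case on RANGER, but I
have not checked), a hedged alternative is to prepend your directory
instead of replacing the whole value. MY_MPI_LIB is just a helper name
I made up for this sketch:

```shell
# Directory from the job script above; assumed to hold the OpenMPI 1.4.1 libraries.
MY_MPI_LIB=/work/01301/mmacmane/lib

# Prepend it and keep whatever value was already set.
# The ${VAR:+...} form adds the ":" separator only if LD_LIBRARY_PATH is non-empty.
export LD_LIBRARY_PATH="${MY_MPI_LIB}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

echo "$LD_LIBRARY_PATH"
```

Putting your directory first means the dynamic linker searches it
before any other entries already in the variable.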

--
Prentice
