Rolf Vandevaart wrote:
> I think what you are looking for is this:
> 
> --mca plm_rsh_disable_qrsh 1
> 
> This means we will disable the use of qrsh and use rsh or ssh instead.
> 
> The --mca pls ^sge does not work anymore for two reasons.  First, the
> "pls" framework was renamed "plm".  Second, the gridengine plm was
> folded into the rsh/ssh one.
> 

Rolf,

Thanks for the quick reply.  That solved the problem.
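
For anyone finding this thread later, below is a minimal sketch of the
launch line with that change applied. It is the command from the original
message with the obsolete "--mca pls ^sge" swapped for the option Rolf
suggested. The build_cmd helper and the fallback values for OMPI and
MACHINE_FILE are illustrative assumptions, not part of the actual job
script.

```shell
#!/bin/sh
# Sketch only: rebuilds the mpirun line from the original post, replacing
# the obsolete "--mca pls ^sge" with "--mca plm_rsh_disable_qrsh 1".
# build_cmd and the fallback values below are illustrative assumptions.

build_cmd() {
    ompi="$1"          # Open MPI install prefix ($OMPI in the original post)
    machine_file="$2"  # host list ($MACHINE_FILE in the original post)
    printf '%s\n' "$ompi/bin/mpirun -mca btl openib,sm,self --mca plm_rsh_disable_qrsh 1 -machinefile $machine_file -x LD_LIBRARY_PATH -np 16 ./xhpl"
}

# Print the command for inspection; it is not executed here.
build_cmd "${OMPI:-/opt/openmpi/1.3.3-pgi}" "${MACHINE_FILE:-machines}"
```

The same MCA parameter can also be set through the environment as
OMPI_MCA_plm_rsh_disable_qrsh=1, which avoids editing the mpirun line in
each job script.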

Craig


> A few more details at
> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
> 
> Rolf
> 
> On 07/23/09 10:34, Craig Tierney wrote:
>> I have built OpenMPI 1.3.3 without support for SGE.
>> I just want to launch jobs with loose integration right
>> now.
>>
>> Here is how I configured it:
>>
>> ./configure CC=pgcc CXX=pgCC F77=pgf90 F90=pgf90 FC=pgf90 \
>>     --prefix=/opt/openmpi/1.3.3-pgi --without-sge \
>>     --enable-io-romio --with-openib=/opt/hjet/ofed/1.4.1 \
>>     --with-io-romio-flags=--with-file-system=lustre \
>>     --enable-orterun-prefix-by-default
>>
>> I can start jobs from the command line just fine.  When
>> I try to do the same thing inside an SGE job, I get
>> errors like the following:
>>
>>
>> error: executing task of job 5041155 failed:
>> --------------------------------------------------------------------------
>>
>> A daemon (pid 13324) died unexpectedly with status 1 while attempting
>> to launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>>
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --------------------------------------------------------------------------
>>
>> --------------------------------------------------------------------------
>>
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --------------------------------------------------------------------------
>>
>> mpirun: clean termination accomplished
>>
>>
>> I am starting mpirun with the following options:
>>
>> $OMPI/bin/mpirun -mca btl openib,sm,self --mca pls ^sge \
>>     -machinefile $MACHINE_FILE -x LD_LIBRARY_PATH -np 16 ./xhpl
>>
>> The options are to ensure I am using IB, that SGE is not used, and that
>> the LD_LIBRARY_PATH is sent along to ensure dynamic linking is done
>> correctly.
>>
>> This worked with 1.2.7 (except that the pls option was set to gridengine
>> instead of sge), but I can't get it to work with 1.3.3.
>>
>> Am I missing something obvious for getting jobs with loose integration
>> started?
>>
>> Thanks,
>> Craig
>>
> 
> 


-- 
Craig Tierney (craig.tier...@noaa.gov)
