I am submitting a job for execution under SGE.  My default shell is /bin/csh.  
The submitted script has #!/bin/bash at the top and runs on the first node 
allocated to the job.  It runs a Python wrapper that ultimately issues the 
following mpirun command:

/apps/local/test/openmpi/bin/mpirun --machinefile mpihosts.914 -np 48 \
    -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 \
    --mca btl ^tcp \
    --mca shmem_mmap_relocate_backing_file -1 \
    --bind-to-core --bycore \
    --mca orte_rsh_agent /usr/bin/rsh \
    --mca plm_rsh_disable_qrsh 1 \
    /apps/local/test/solver/bin/solver_openmpi -cycles 50 -ri restart.0 -i flow.inp \
    >& output

Just so there's no confusion, Open MPI is built without support for SGE, so it 
should be using rsh to launch the remote processes.

Four nodes are involved (12 cores each, 48 processes total).  In the output 
file, I see three sets of the messages shown below.  I assume I am seeing one 
set for each of the three remote nodes where processes need to be launched:

/bin/.: Permission denied.
OPAL_PREFIX=/apps/local/falcon2014/openmpi: Command not found.
export: Command not found.
PATH=/apps/local/test/openmpi/bin:/bin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/usr/openwin/bin:/usr/local/etc:/home/bloscel/bin:/usr/ucb:/usr/bsd: Command not found.
export: Command not found.
LD_LIBRARY_PATH: Undefined variable.

These look like errors you get when csh is trying to parse commands intended 
for bash.
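
A quick way to test that guess would be to ask each host in the machinefile 
which login shell rsh hands commands to.  The following is only a sketch.  It 
assumes Python 3.7 or later, that each machinefile line starts with a host 
name, and that `echo $SHELL` reports the remote login shell.  The helper names 
read_hosts, probe_commands, and run_probes are hypothetical and are not part 
of Open MPI or of the wrapper mentioned above.

```python
import subprocess


def read_hosts(machinefile_text):
    """Return the unique host names from machinefile text, in first-seen order."""
    hosts = []
    for raw in machinefile_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        host = line.split()[0]  # ignore trailing fields such as slots=12
        if host not in hosts:
            hosts.append(host)
    return hosts


def probe_commands(hosts, rsh="/usr/bin/rsh"):
    """Build one argv list per host; $SHELL is expanded by the remote shell."""
    return [[rsh, host, "echo $SHELL"] for host in hosts]


def run_probes(machinefile_path, rsh="/usr/bin/rsh"):
    """Run the probe on every host and print what each one reports."""
    with open(machinefile_path) as f:
        hosts = read_hosts(f.read())
    for argv in probe_commands(hosts, rsh):
        result = subprocess.run(argv, capture_output=True, text=True)
        print(argv[1], result.stdout.strip(), result.stderr.strip())
```

Calling run_probes("mpihosts.914") from the job script would print one line 
per host.  If the remote hosts report /bin/csh, that would be consistent with 
csh receiving Bourne-style commands.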

Does anyone know what may be going on here?

Thanks,

Ed