On 26.06.2012, at 15:16, Semi wrote:
> I put this entry in rungms:
>
> set TARGET=mpi
> /storage/openmpi-1.5_openib/bin/mpirun -np $NPROCS /storage/app/ymiller/gamess_openib/gamess.$VERNO.x $JOB
>
> when I run:
> ./rungms exam01 00 20 >& exam01.log
> it works. log attached
>
> when I try to run it via SGE in the following way, I get an error:
> #!/bin/sh
> #$ -N test
> #$ -o /storage/app/ymiller/gamess_openib/TEST/test.o -e /storage/app/ymiller/gamess_openib/TEST/test.e
> #$ -m ea
> #$ -A gamess_parallel
> #$ -R y
> #$ -pe ompi 10
> export GAMESS=/storage/app/ymiller/gamess_openib
> export LD_LIBRARY_PATH=$GAMESS/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
> cd $TMPDIR
> cp $GAMESS/tests/exam01.inp exam01.F05
> JOB=exam01
> CUSTOM_TMPDIR=$GAMESS/TEST
> BINARY_LOCATION=$GAMESS
> . $GAMESS/subgms_export
If you use `rungms`, there is no need to set my variables and source subgms_export beforehand; using both is a contradiction.
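Something along these lines should already be enough; a minimal sketch, assuming your rungms accepts the -scr switch as in your call below and that the output may live under $GAMESS/TEST (adjust paths as needed):

#!/bin/sh
#$ -N exam01
#$ -pe ompi 10
export GAMESS=/storage/app/ymiller/gamess_openib
export LD_LIBRARY_PATH=$GAMESS/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
# run from a persistent directory and hand rungms only the plain job name,
# so that it derives all scratch file names from "exam01" alone
cd $GAMESS/TEST
cp $GAMESS/tests/exam01.inp exam01.inp
$GAMESS/rungms exam01 00 $NSLOTS -scr $TMPDIR < /dev/null > exam01.out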
> unset JOB
> unset CUSTOM_TMPDIR
> unset BINARY_LOCATION
> rm -f $IRCDATA
> rm -f $PUNCH
> rm -f $SIMEN
> rm -f $SIMCOR
>
> HOSTFILE=$TMPDIR/machines
> awk '{ for (i=0;i<$2;++i) {print $1} }' $PE_HOSTFILE >> $HOSTFILE
>
> $GAMESS/rungms $GAMESS/tests/exam01 00 $NSLOTS -scr $TMPDIR < /dev/null > $GAMESS/TEST/exam01.out
>
> more exam01.out
> ----- GAMESS execution script 'rungms' -----
> This job is running on host sge177
> under operating system Linux at Tue Jun 26 14:44:09 IDT 2012
> SGE has assigned the following compute nodes to this run:
> sge177
> Available scratch disk space (Kbyte units) at beginning of the job is
> Filesystem 1K-blocks Used Available Use% Mounted on
> /dev/sda1 206424760 6633828 189305172 4% /
> Copying input file /storage/app/ymiller/gamess_openib/tests/exam01.inp to your run's scratch directory...
>
>
> ERROR OPENING PRE-EXISTING FILE INPUT,
> ASSIGNED TO EXPLICIT FILE NAME
> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05,
> PLEASE CHECK THE -SETENV- FILE ASSIGNMENTS IN YOUR -RUNGMS- SCRIPT.
> EXECUTION OF GAMESS TERMINATED -ABNORMALLY- AT Tue Jun 26 14:44:10 2012
> CPU 0: STEP CPU TIME= 0.00 TOTAL CPU TIME= 0.0 ( 0.0 MIN)
> TOTAL WALL CLOCK TIME= 0.0 SECONDS, CPU UTILIZATION IS 100.00%
> more test.e
> cp /storage/app/ymiller/gamess_openib/tests/exam01.inp /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05
> cp: cannot create regular file `/tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05': No such file or directory
Did you solve this in the meantime?
The /tmp/6053922.1.bioinfo.q directory is created by SGE, but none of the subdirectories below it. I wonder where this gets appended to the path; it looks like the full path to $GAMESS is always glued onto the scratch directory.
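For illustration, this is roughly how the doubled path comes about when the first argument to rungms is a full path instead of a plain job name (just a sketch in plain sh; rungms itself is a csh script and uses its own variable names):

SCR=/tmp/6053922.1.bioinfo.q                          # SGE's $TMPDIR - this exists
JOB=/storage/app/ymiller/gamess_openib/tests/exam01   # full path passed as 1st argument
INPUT=$SCR/$JOB.F05
echo "$INPUT"
# -> /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05
#    only $SCR exists, none of the directories below it, hence all the
#    "No such file or directory" messages from cp, grep, touch, uniq and wc

Hence my suggestion to call rungms with just "exam01" and to have exam01.inp in the directory you start it from.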
> unset echo
> setenv AUXDATA ./auxdata
Instead of "." I suggest to put the actual (absolute) path to GAMESS in
`rungms`.
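For example, mirroring the assignments above (assuming the installation really lives under /storage/app/ymiller/gamess_openib; if your rungms still defines a GMSPATH variable near its top it can be reused, otherwise put the literal path):

set GMSPATH=/storage/app/ymiller/gamess_openib
setenv AUXDATA $GMSPATH/auxdata
setenv ERICFMT $GMSPATH/ericfmt.dat
setenv MCPPATH $GMSPATH/mcpdata
setenv BASPATH $GMSPATH/auxdata/BASES
setenv QUANPOL $GMSPATH/auxdata/QUANPOL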
> setenv EXTBAS /dev/null
> setenv NUCBAS /dev/null
> setenv POSBAS /dev/null
> setenv ERICFMT ./ericfmt.dat
> setenv MCPPATH ./mcpdata
> setenv BASPATH ./auxdata/BASES
> setenv QUANPOL ./auxdata/QUANPOL
> setenv MAKEFP .///storage/app/ymiller/gamess_openib/tests/exam01.efp
This line is originally:
setenv MAKEFP ~$USER/scr/$JOB.efp
Did you edit it by hand?
> setenv GAMMA .///storage/app/ymiller/gamess_openib/tests/exam01.gamma
> setenv TRAJECT .///storage/app/ymiller/gamess_openib/tests/exam01.trj
> setenv RESTART .///storage/app/ymiller/gamess_openib/tests/exam01.rst
> setenv INPUT /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05
> setenv PUNCH .///storage/app/ymiller/gamess_openib/tests/exam01.dat
> setenv AOINTS /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F08
> .......................................
> setenv GMCCCS /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F99
> unset echo
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory
> grep: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.F05: No such file or directory
There are exactly four `grep`s of the input file in `rungms`; because of the wrong path the file isn't found, hence the four messages.
-- Reuti
> DDI Process 0: error code 911
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 911.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 30229 on
> node sge177 exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> touch: cannot touch `/tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd': No such file or directory
> uniq: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd: No such file or directory
> uniq: write error: No such file or directory
> wc: /tmp/6053922.1.bioinfo.q//storage/app/ymiller/gamess_openib/tests/exam01.nodes.mpd: No such file or directory
> NNODES: Subscript out of range.
>
> On 6/25/2012 3:11 PM, Reuti wrote:
>> On 25.06.2012, at 14:03, Dave Love wrote:
>>
>>
>>> Reuti <[email protected]>
>>> writes:
>>>
>>>
>>>> Well, we also use GAMESS sometimes but just with the default socket
>>>> communication.
>>>>
>>> Did you ever manage to get that tightly integrated? Alternatively, is
>>> there a good reason not to use the MPI support I seem to remember it has
>>> now?
>>>
>> No, it's still at the state we talked about a year ago or so: it starts tightly
>> integrated (i.e. with `qrsh -inherit ...` and ssh/rsh completely disabled in
>> the cluster), but then jumps out of the process tree, so there is no
>> accounting.
>>
>> I found the manual's explanation of the MPI data servers quite confusing,
>> and as we use it only once in a while I didn't spend more time on it.
>>
>> -- Reuti
>>
>>
>>
>>> --
>>> Community Grid Engine:
>>> http://arc.liv.ac.uk/SGE/
>>
>
>
> <exam01.log.rtf>
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users