Dear Ralph and other users
I tried both variants, with the relative path and with the -wdir option,
but in both cases the error is still the same. Additionally, I tried
simply starting the job in my home directory, but that does not help
either... any other ideas?
Thanks
Bernhard
[bknapp@quoVadis04 testSet]$ mpirun -np 8 -machinefile
/home/bknapp/scripts/machinefile.txt mdrun -np 8 -nice 0 -s
gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr -o
gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.trr -c
gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.pdb -g
gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log -e
gromacsRuns/testSet/1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.edr -v
[bknapp@quoVadis04 testSet]$ mpirun -np 8 -machinefile
/home/bknapp/scripts/machinefile.txt mdrun -np 8 -nice 0 -s
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr -o
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.trr -c
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.pdb -g
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log -e
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.edr -v -wdir
/home/bknapp/gromacsRuns/testSet/
Back Off! I just backed up 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log
to ./#1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log.15#
Getting Loaded...
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
-------------------------------------------------------
Program mdrun, VERSION 4.0.3
Source code file: gmxfio.c, line: 736
Can not open file:
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr
-------------------------------------------------------
"My Brothers are Protons (Protons!), My Sisters are Neurons (Neurons)"
(Gogol Bordello)
Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 8
gcq#318: "My Brothers are Protons (Protons!), My Sisters are Neurons
(Neurons)" (Gogol Bordello)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 4313 on
node 192.168.0.103 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
Ralph wrote:
I assume you are running in a non-managed environment and so are using
ssh as your launcher? Could you tell us what version of OMPI you are
using?
The problem is that ssh drops you into your home directory, not your
current working directory. Thus, the path to any file you specify must
be relative to your home directory. Alternatively, you can specify the
desired current working directory on the mpirun command line. Do a "man
mpirun" to find the specific option.
I'd have to check, but we may have corrected this in recent versions
(or a soon-to-be-released one) so that we automatically move you to
the cwd after the daemon is started. However, I know that we didn't do
that in some earlier versions - perhaps in the 1.2.x series as well.
Ralph
On Apr 7, 2009, at 5:05 AM, Bernhard Knapp wrote:
Hi
I am trying to get a parallel job of the GROMACS software started.
MPI seems to boot fine, but unfortunately it does not seem to be able to
open a specified file, although the file is definitely in the directory
where the job is started. I also changed the file permissions to 777,
but that does not affect the result. Any suggestions?
cheers
Bernhard
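A quick check (a sketch only, using the host names from the output
below, where rank 0 lands on quoVadis03 while mpirun is run on
quoVadis04) is whether the node running rank 0 can see the file from the
directory an ssh-started process ends up in:

# hypothetical check run from quoVadis04; quoVadis03 is where rank 0 runs below
ssh quoVadis03 'pwd; ls -l 1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr'

If pwd prints the home directory and the ls fails, the remote ranks are
simply not starting in the directory that holds the .tpr file.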
[bknapp@quoVadis04 testSet]$ mpirun -np 8 -machinefile /home/bknapp/
scripts/machinefile.txt mdrun -np 8 -nice 0 -s
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr -o
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.trr -c
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.pdb -g
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.log -e
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.edr -v
bknapp@192.168.0.103's password:
NNODES=8, MYRANK=1, HOSTNAME=quoVadis04
NNODES=8, MYRANK=3, HOSTNAME=quoVadis04
NNODES=8, MYRANK=7, HOSTNAME=quoVadis04
NNODES=8, MYRANK=0, HOSTNAME=quoVadis03
NNODES=8, MYRANK=4, HOSTNAME=quoVadis03
NNODES=8, MYRANK=6, HOSTNAME=quoVadis03
NODEID=4 argc=16
NNODES=8, MYRANK=2, HOSTNAME=quoVadis03
NODEID=1 argc=16
NODEID=3 argc=16
NODEID=7 argc=16
NODEID=2 argc=16
NODEID=6 argc=16
NODEID=0 argc=16
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
-------------------------------------------------------
Program mdrun, VERSION 4.0.3
Source code file: gmxfio.c, line: 736
Can not open file:
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr
-------------------------------------------------------
"I Need a Little Poison" (Throwing Muses)
Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 8
gcq#108: "I Need a Little Poison" (Throwing Muses)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 3777 on
node 192.168.0.103 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[bknapp@quoVadis04 testSet]$ ll
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr
-rwxrwxrwx 1 bknapp bknapp 6118424 2009-03-13 09:44
1fyt_PKYVKQNTLELAT_bindingRegionsOnly.md.tpr