Hi there,

I am trying to run an MD simulation of a 13-residue peptide with distance restraints. I previously ran into a problem with this system: mdrun stopped with an error message concerning distance restraints and domain decomposition. That was apparently caused by a bug in mshift.c, which has since been fixed. However, my simulation still runs properly only with the serial version of mdrun. When I try to run it in parallel, I get this error message:
NNODES=4, MYRANK=0, HOSTNAME=chong06.chem.pitt.edu
NNODES=4, MYRANK=2, HOSTNAME=chong06.chem.pitt.edu
NNODES=4, MYRANK=3, HOSTNAME=chong06.chem.pitt.edu
NNODES=4, MYRANK=1, HOSTNAME=chong06.chem.pitt.edu
NODEID=0 argc=13
NODEID=3 argc=13
NODEID=2 argc=13
NODEID=1 argc=13

                 :-)  G R O M A C S  (-:

                   Grunge ROck MAChoS

                 :-)  VERSION 4.0.2  (-:

Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2008, The GROMACS development team,
check out http://www.gromacs.org for more information.

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

   :-)  /home/map110/gromacs.4.0.2patched/bin/mdrunmpi  (-:

Option     Filename       Type          Description
------------------------------------------------------------
  -s       md.tpr         Input         Run input file: tpr tpb tpa
  -o       md.trr         Output        Full precision trajectory: trr trj cpt
  -x       md.xtc         Output, Opt!  Compressed trajectory (portable xdr format)
  -cpi     state.cpt      Input, Opt.   Checkpoint file
  -cpo     state.cpt      Output, Opt.  Checkpoint file
  -c       md.gro         Output        Structure file: gro g96 pdb
  -e       md.edr         Output        Energy file: edr ene
  -g       md.log         Output        Log file
  -dgdl    dgdl.xvg       Output, Opt.  xvgr/xmgr file
  -field   field.xvg      Output, Opt.  xvgr/xmgr file
  -table   table.xvg      Input, Opt.   xvgr/xmgr file
  -tablep  tablep.xvg     Input, Opt.   xvgr/xmgr file
  -tableb  table.xvg      Input, Opt.   xvgr/xmgr file
  -rerun   rerun.xtc      Input, Opt.   Trajectory: xtc trr trj gro g96 pdb cpt
  -tpi     tpi.xvg        Output, Opt.  xvgr/xmgr file
  -tpid    tpidist.xvg    Output, Opt.  xvgr/xmgr file
  -ei      sam.edi        Input, Opt.   ED sampling input
  -eo      sam.edo        Output, Opt.  ED sampling output
  -j       wham.gct       Input, Opt.   General coupling stuff
  -jo      bam.gct        Output, Opt.  General coupling stuff
  -ffout   gct.xvg        Output, Opt.  xvgr/xmgr file
  -devout  deviatie.xvg   Output, Opt.  xvgr/xmgr file
  -runav   runaver.xvg    Output, Opt.  xvgr/xmgr file
  -px      pullx.xvg      Output, Opt.  xvgr/xmgr file
  -pf      pullf.xvg      Output, Opt.  xvgr/xmgr file
  -mtx     nm.mtx         Output, Opt.  Hessian matrix
  -dn      dipole.ndx     Output, Opt.  Index file

Option        Type    Value       Description
------------------------------------------------------
-[no]h        bool    no          Print help info and quit
-nice         int     0           Set the nicelevel
-deffnm       string              Set the default filename for all file options
-[no]xvgr     bool    yes         Add specific codes (legends etc.) in the output xvg files for the xmgrace program
-[no]pd       bool    no          Use particle decompostion
-dd           vector  0 0 0       Domain decomposition grid, 0 is optimize
-npme         int     -1          Number of separate nodes to be used for PME, -1 is guess
-ddorder      enum    interleave  DD node order: interleave, pp_pme or cartesian
-[no]ddcheck  bool    yes         Check for all bonded interactions with DD
-rdd          real    0           The maximum distance for bonded interactions with DD (nm), 0 is determine from initial coordinates
-rcon         real    0           Maximum distance for P-LINCS (nm), 0 is estimate
-dlb          enum    auto        Dynamic load balancing (with DD): auto, no or yes
-dds          real    0.8         Minimum allowed dlb scaling of the DD cell size
-[no]sum      bool    yes         Sum the energies at every step
-[no]v        bool    no          Be loud and noisy
-[no]compact  bool    yes         Write a compact log file
-[no]seppot   bool    no          Write separate V and dVdl terms for each interaction type and node to the log file(s)
-pforce       real    -1          Print all forces larger than this (kJ/mol nm)
-[no]reprod   bool    no          Try to avoid optimizations that affect binary reproducibility
-cpt          real    15          Checkpoint interval (minutes)
-[no]append   bool    no          Append to previous output files when restarting from checkpoint
-maxh         real    -1          Terminate after 0.99 times this time (hours)
-multi        int     0           Do multiple simulations in parallel
-replex       int     0           Attempt replica exchange every # steps
-reseed       int     -1          Seed for replica exchange, -1 is generate a seed
-[no]glas     bool    no          Do glass simulation with special long range corrections
-[no]ionize   bool    no          Do a simulation including the effect of an X-Ray bombardment on your system

Reading file md.tpr, VERSION 4.0 (single precision)

NOTE: atoms involved in distance restraints should be within the longest cut-off distance, if this is not the case mdrun generates a fatal error, in that case use particle decomposition (mdrun option -pd)

WARNING: Can not write distance restraint data to energy file with domain decomposition

-------------------------------------------------------
Program mdrunmpi, VERSION 4.0.2
Source code file: domdec.c, line: 5842

Fatal error:
There is no domain decomposition for 4 nodes that is compatible with the given box and a minimum cell size of 3.03524 nm
Change the number of nodes or mdrun option -rdd or -dds
Look in the log file for details on the domain decomposition
-------------------------------------------------------

"It's Bicycle Repair Man !" (Monty Python)

Error on node 0, will try to stop all the nodes
Halting parallel program mdrunmpi on CPU 0 out of 4

-------------------------------------------------------
Program mdrunmpi, VERSION 4.0.2
Source code file: domdec.c, line: 5860

Fatal error:
The size of the domain decomposition grid (0) does not match the number of nodes (4). The total number of nodes is 4
-------------------------------------------------------

"It's Bicycle Repair Man !" (Monty Python)

Error on node 1, will try to stop all the nodes
Halting parallel program mdrunmpi on CPU 1 out of 4

gcq#205: "It's Bicycle Repair Man !" (Monty Python)

[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0

gcq#205: "It's Bicycle Repair Man !" (Monty Python)

[cli_1]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 1
rank 1 in job 66 chong06.chem.pitt.edu_35438 caused collective abort of all ranks
  exit status of rank 1: killed by signal 9
rank 0 in job 66 chong06.chem.pitt.edu_35438 caused collective abort of all ranks
  exit status of rank 0: killed by signal 9

Can anybody explain to me what could be the problem? I was thinking there is a possibility that there might be another bug.

Thanks in advance!

Maria
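
The first fatal error above states a geometric condition: the box has to be cut into 4 cells, none smaller than 3.03524 nm along any axis. The Python sketch below illustrates that arithmetic only. It is not the check mdrun performs in domdec.c, which also takes into account -dds, dynamic load balancing, and separate PME nodes. The function name `compatible_grids` and the 4.0 nm cubic box are hypothetical, since the post does not give the real box dimensions.

```python
from itertools import product


def compatible_grids(n_nodes, box_nm, min_cell_nm):
    """Return every (nx, ny, nz) grid with nx*ny*nz == n_nodes whose cells
    are at least min_cell_nm long along each axis.

    Simplified illustration: assumes a rectangular box and equal-sized cells.
    """
    grids = []
    for nx, ny in product(range(1, n_nodes + 1), repeat=2):
        if n_nodes % (nx * ny) != 0:
            continue  # nx*ny must divide the node count
        nz = n_nodes // (nx * ny)
        grid = (nx, ny, nz)
        # Cell length along an axis = box length / number of cells on that axis
        if all(length / n >= min_cell_nm for length, n in zip(box_nm, grid)):
            grids.append(grid)
    return grids


# Hypothetical 4.0 nm cubic box; 3.03524 nm is the minimum cell size from the log.
hypothetical_box = (4.0, 4.0, 4.0)
print(compatible_grids(4, hypothetical_box, 3.03524))
print(compatible_grids(1, hypothetical_box, 3.03524))
```

Under these assumptions, any split of a 4.0 nm axis gives cells of at most 2.0 nm, so no 4-node grid qualifies while a single node does. That would be consistent with the serial run working and with the error's suggestion to change the number of nodes or adjust -rdd or -dds.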