Sounds worrying :-( Thanks for the detailed report and troubleshooting! So far, I can't think of a reason for it.
A couple of suggestions:

* try again with 4.6.3 (at least while troubleshooting) in case it's a fixed bug
* post a representative .mdp file
* is there anything out of the ordinary in the topology?
* if the problem is restart-related and shows up in the drift quickly, then you can probably find a reproducible case via a job that does lots of short-interval restarts and saves all the intermediate files (see the sketch after this list) - a (set of) inputs that can reproduce the problem sounds like what we'd need to diagnose and/or fix anything
* does it happen in a non-multi simulation? (or more particularly, what are you doing with -multi?)
* check .log files for warnings, and that there are none being suppressed at the grompp stage
* see if the group cut-off scheme in 4.6.x shows the same problem
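Something like the following would do for the short-interval restart job - an untested sketch for a single (non-multi) run, where the binary name, the -maxh value and the "test" file names are placeholders to adjust to your setup. -noappend makes each continuation write new .partNNNN output files instead of appending, and copying the checkpoint before each leg preserves exactly what that restart read:

    for i in $(seq 1 50); do
        # keep a copy of the checkpoint this leg will read (no-op on the first leg)
        cp test.cpt test_leg${i}.cpt 2>/dev/null
        # run ~6 minutes, write a checkpoint, stop; the next iteration restarts from it
        mdrun_mpi_d -deffnm test -maxh 0.1 -noappend -cpi
    done

If the drift jump shows up in one of those legs, then that leg's saved .cpt plus the original .tpr is the kind of reproducer we could work with.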
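It's also worth cross-checking the fitted slopes independently of your analysis tool. With -xvg none, the file your g_energy command (quoted below) writes is just two columns (time in ps, total energy in kJ mol^-1), so an awk one-liner gives the least-squares drift directly - a sketch, assuming that column layout:

    echo "total" | g_energy_d -f ${FILE}${i}.edr -o total_${FILE}${i}.xvg -xvg none
    # least-squares slope in kJ mol^-1 ps^-1; multiply by 1000 for kJ mol^-1 ns^-1
    awk '{ n++; sx += $1; sy += $2; sxx += $1*$1; sxy += $1*$2 }
         END { print (n*sxy - sx*sy) / (n*sxx - sx*sx) }' total_${FILE}${i}.xvg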
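For rebuilding an input from just before the restart (as you mention below), grompp can take coordinates and velocities from a full-precision .trr with -t, and -time selects the frame - a sketch, where the .mdp/.top names and the time are placeholders:

    # 50000 (ps) stands in for the time of the frame just before the restart
    grompp_d -f md.mdp -p topol.top -c conf.gro -t ${FILE}0.trr -time 50000 -o probe.tpr

That won't be a bit-for-bit continuation the way a checkpoint restart is, but it should be close enough to test whether the drift change reproduces.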
Mark

On Mon, Sep 9, 2013 at 4:08 PM, Richard Broadbent
<richard.broadben...@imperial.ac.uk> wrote:
> Dear All,
>
> I've been analysing a series of long (200 ns) NVE simulations (md
> integrator) on ~93,000-atom systems. I ran the simulations in groups
> of 3 using the -multi option in GROMACS v4.6.1 double precision.
>
> Simulations were run with 1 OpenMP thread per MPI process.
>
> The simulations were restarted at regular intervals using the
> following submission script:
>
> FILE=4.6_P84_DIO_
>
> module load fftw xe-gromacs/4.6.1
>
> # Change to the directory that the job was submitted from
> cd $PBS_O_WORKDIR
>
> export NPROC=`qstat -f $PBS_JOBID | grep mppwidth | awk '{print $3}'`
> export NTASK=`qstat -f $PBS_JOBID | grep mppnppn | awk '{print $3}'`
>
> aprun -n $NPROC -N $NTASK mdrun_mpi_d -deffnm $FILE -maxh 24 -multi 3 \
>     -npme 64 -append -cpi
>
> ###
>
> The first simulation was run with the same script, except that the
> mdrun line was
>
> aprun -n $NPROC -N $NTASK mdrun_mpi_d -deffnm $FILE -maxh 24 -multi 3 \
>     -npme 64
>
> ###
>
> The simulations generally ran and restarted without trouble; however,
> in several simulations the energy drift changed radically following
> the restart.
>
> In one simulation, the run went for 50 ns (including one restart) with
> a drift of -141.6 +/- 0.1 kJ mol^-1 ns^-1. It was restarted, then had
> a drift of +104 +/- 1 kJ mol^-1 ns^-1 for 15 ns, then was restarted
> again and continued with a drift of -138 +/- 0.1 kJ mol^-1 ns^-1 for a
> further 50 ns.
>
> The other 2 simulations running in parallel with this calculation
> through the -multi option did not experience a change in gradient.
>
> The drifts were calculated by least-squares analysis of the total
> energy data given by
>
> echo "total" | g_energy_d -f ${FILE}${i}.edr -o total_${FILE}${i}.xvg -xvg none
>
> The simulation writes to the .edr every 20 ps. The transition is
> masked by the expected oscillations in energy due to the integrator on
> a 2 ns interval, but the change in drift is clear when looking at a
> 4 ns range centred on the restart.
>
> The hardware used was of the same specification for all jobs (27 Cray
> XE6 nodes (9 nodes per simulation), 32 MPI processes per node).
>
> The simulations use the Verlet cut-off scheme, and there are H-bond
> constraints enforced using LINCS (order 6, 2 iterations).
>
> I can't think what would cause this change in the drift during a
> restart. However, I have seen it in simulations run on both an AMD
> system (Cray XE6, AVX-FMA) and an Intel system (SGI ICE, SSE4.1).
>
> I have some data generated using the same procedure with v4.5.5 and
> v4.5.7 (different cut-off scheme), and the restarts in that system
> have not caused any appreciable changes in the simulation.
>
> Unfortunately I didn't save the checkpoint files used for the restart
> (I will in the future). I'm going to try building a new input file
> from just before the restart using the trr trajectory data.
>
> Does anyone have any ideas of what might have caused this?
>
> Has anyone seen similar effects?
>
> Thanks,
>
> Richard