I am now able to run simulations on the GPU, but the output is strange. For example, the temperature drops to 270 K while ref_t=298 (Tcoupl=andersen). Moreover, after several hours of simulation mdrun-gpu starts to output "NAN" energies and hangs. The pre-run and post-run GPU memory tests always pass. The graphics card is the one supplied with HP desktops (possibly made by MSI): an NVIDIA GTX 260 with 1.8 GB of memory. The output of the mdrun and mdrun-gpu versions of Gromacs is given below. Any ideas? Thanks.

Igor

////////////////////////////////////////////////////////////////////////////////////////////////////
Log file opened on Fri Oct  8 14:46:51 2010
Host: powerpc  pid: 32083  nodeid: 0  nnodes:  4
The Gromacs distribution was built Thu Sep 30 14:42:48 PDT 2010 by
leont...@powerpc (Linux 2.6.32-22-generic x86_64)


                        :-)  G  R  O  M  A  C  S  (-:

              Gromacs Runs One Microsecond At Cannonball Speeds

                           :-)  VERSION 4.5.1  (-:

       Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
     Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
       Gerrit Groenhof, Peter Kasson, Per Larsson, Peiter Meulenhoff,
         Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
               Michael Shirts, Alfons Sijbers, Peter Tieleman,

              Berk Hess, David van der Spoel, and Erik Lindahl.

      Copyright (c) 1991-2000, University of Groningen, The Netherlands.
           Copyright (c) 2001-2010, The GROMACS development team at
       Uppsala University & The Royal Institute of Technology, Sweden.
           check out http://www.gromacs.org for more information.

        This program is free software; you can redistribute it and/or
         modify it under the terms of the GNU General Public License
        as published by the Free Software Foundation; either version 2
            of the License, or (at your option) any later version.

     :-)  /usr/local/opt/bin/gromacs/gromacs-4.5.1/bin/mdrun_mpich2  (-:



Input Parameters:
  integrator           = md
  nsteps               = 10000
  init_step            = 0
  ns_type              = Grid
  nstlist              = 10
  ndelta               = 2
  nstcomm              = 1003
  comm_mode            = Linear
  nstlog               = 1000
  nstxout              = 5000
  nstvout              = 10000000
  nstfout              = 0
  nstcalcenergy        = 10
  nstenergy            = 1000
  nstxtcout            = 0
  init_t               = 0
  delta_t              = 0.001
  xtcprec              = 1000
  nkx                  = 54
  nky                  = 60
  nkz                  = 90
  pme_order            = 6
  ewald_rtol           = 1e-05
  ewald_geometry       = 0
  epsilon_surface      = 0
  optimize_fft         = TRUE
  ePBC                 = xyz
  bPeriodicMols        = FALSE
  bContinuation        = FALSE
  bShakeSOR            = FALSE
  etc                  = Andersen
  nsttcouple           = 10
  epc                  = Berendsen
  epctype              = Isotropic
  nstpcouple           = 10
  tau_p                = 0.5
  ref_p (3x3):
     ref_p[    0]={ 1.01325e+00,  0.00000e+00,  0.00000e+00}
     ref_p[    1]={ 0.00000e+00,  1.01325e+00,  0.00000e+00}
     ref_p[    2]={ 0.00000e+00,  0.00000e+00,  1.01325e+00}
  compress (3x3):
     compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
     compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
     compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
  refcoord_scaling     = No
  posres_com (3):
     posres_com[0]= 0.00000e+00
     posres_com[1]= 0.00000e+00
     posres_com[2]= 0.00000e+00
  posres_comB (3):
     posres_comB[0]= 0.00000e+00
     posres_comB[1]= 0.00000e+00
     posres_comB[2]= 0.00000e+00
  andersen_seed        = 815131
  rlist                = 1.2
  rlistlong            = 1.2
  rtpi                 = 0.05
  coulombtype          = PME
  rcoulomb_switch      = 0
  rcoulomb             = 1.2
  vdwtype              = Cut-off
  rvdw_switch          = 0
  rvdw                 = 1.2
  epsilon_r            = 1
  epsilon_rf           = 1
  tabext               = 1
  implicit_solvent     = No
  gb_algorithm         = Still
  gb_epsilon_solvent   = 80
  nstgbradii           = 1
  rgbradii             = 1
  gb_saltconc          = 0
  gb_obc_alpha         = 1
  gb_obc_beta          = 0.8
  gb_obc_gamma         = 4.85
  gb_dielectric_offset = 0.009
  sa_algorithm         = No
  sa_surface_tension   = 2.092
  DispCorr             = EnerPres
  free_energy          = no
  init_lambda          = 0
  delta_lambda         = 0
  n_foreign_lambda     = 0
  sc_alpha             = 0
  sc_power             = 0
  sc_sigma             = 0.3
  sc_sigma_min         = 0.3
  nstdhdl              = 10
  separate_dhdl_file   = yes
  dhdl_derivatives     = yes
  dh_hist_size         = 0
  dh_hist_spacing      = 0.1
  nwall                = 0
  wall_type            = 9-3
  wall_atomtype[0]     = -1
  wall_atomtype[1]     = -1
  wall_density[0]      = 0
  wall_density[1]      = 0
  wall_ewald_zfac      = 3
  pull                 = no
  disre                = No
  disre_weighting      = Conservative
  disre_mixed          = FALSE
  dr_fc                = 1000
  dr_tau               = 0
  nstdisreout          = 100
  orires_fc            = 0
  orires_tau           = 0
  nstorireout          = 100
  dihre-fc             = 1000
  em_stepsize          = 0.01
  em_tol               = 10
  niter                = 20
  fc_stepsize          = 0
  nstcgsteep           = 1000
  nbfgscorr            = 10
  ConstAlg             = Lincs
  shake_tol            = 0.0001
  lincs_order          = 8
  lincs_warnangle      = 30
  lincs_iter           = 4
  bd_fric              = 0
  ld_seed              = 1993
  cos_accel            = 0
  deform (3x3):
     deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
     deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
     deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
  userint1             = 0
  userint2             = 0
  userint3             = 0
  userint4             = 0
  userreal1            = 0
  userreal2            = 0
  userreal3            = 0
  userreal4            = 0
grpopts:
  nrdf:       99021
  ref_t:      298.15
  tau_t:         0.3
anneal:          No
ann_npoints:           0
  acc:            0           0           0
nfreeze: Y Y Y N N N
  energygrp_flags[  0]: 0 0
  energygrp_flags[  1]: 0 0
  efield-x:
     n = 0
  efield-xt:
     n = 0
  efield-y:
     n = 0
  efield-yt:
     n = 0
  efield-z:
     n = 0
  efield-zt:
     n = 0
  bQMMM                = FALSE
  QMconstraints        = 0
  QMMMscheme           = 0
  scalefactor          = 1
qm_opts:
  ngQM                 = 0

Initializing Domain Decomposition on 4 nodes
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
   two-body bonded interactions: 0.585 nm, LJ-14, atoms 10901 11433
multi-body bonded interactions: 0.482 nm, Ryckaert-Bell., atoms 11431 11935
Minimum cell size due to bonded interactions: 0.530 nm
Maximum distance for 9 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm
Using 0 separate PME nodes
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 4 cells with a minimum initial size of 0.663 nm
The maximum allowed number of cells is: X 9 Y 10 Z 16
Domain decomposition grid 1 x 4 x 1, separate PME nodes 0
PME domain decomposition: 1 x 4 x 1
Domain decomposition nodeid 0, coordinates 0 0 0

Table routines are used for coulomb: TRUE
Table routines are used for vdw:     FALSE
Will do PME sum in reciprocal space.

Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
Cut-off's:   NS: 1.2   Coulomb: 1.2   LJ: 1.2
Long Range LJ corr.: <C6> 4.0351e-04
System total charge: -0.000
Generated table with 1100 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Enabling SPC-like water optimization for 11505 molecules.

Configuring nonbonded kernels...
Configuring standard C nonbonded kernels...
Testing x86_64 SSE2 support... present.


Removing pbc first time

Initializing Parallel LINear Constraint Solver

Linking all bonded interactions to atoms
There are 65716 inter charge-group exclusions,
will use an extra communication step for exclusion forces for PME

The initial number of communication pulses is: Y 1
The initial domain decomposition cell size is: Y 1.77 nm

The maximum allowed distance for charge groups involved in interactions is:
                non-bonded interactions           1.200 nm
(the following are initial values, they could change due to box deformation)
           two-body bonded interactions  (-rdd)   1.200 nm
         multi-body bonded interactions  (-rdd)   1.200 nm
 atoms separated by up to 9 constraints  (-rcon)  1.773 nm

When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: Y 1
The minimum size for domain decomposition cells is 1.200 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: Y 0.68
The maximum allowed distance for charge groups involved in interactions is:
                non-bonded interactions           1.200 nm
           two-body bonded interactions  (-rdd)   1.200 nm
         multi-body bonded interactions  (-rdd)   1.200 nm
 atoms separated by up to 9 constraints  (-rcon)  1.200 nm


Making 1D domain decomposition grid 1 x 4 x 1, home cell index 0 0 0

Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
 0:  rest
There are: 46503 Atoms
Charge group distribution at step 0: 4533 7043 7334 4581
Grid: 10 x 6 x 17 cells

Constraining the starting coordinates (step 0)

Constraining the coordinates at t0-dt (step 0)
RMS relative constraint deviation after constraining: 7.96e-07
Initial temperature: 297.745 K

Started mdrun on node 0 Fri Oct  8 14:46:51 2010

          Step           Time         Lambda
             0        0.00000        0.00000

  Energies (kJ/mol)
          Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
   9.26629e+03    2.53358e+04    1.36779e+03    2.97600e+04    1.20809e+04
    Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
   1.40505e+05    3.83498e+04   -2.30989e+03   -5.95333e+05   -1.96357e+05
     Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
  -5.37334e+05    1.22595e+05   -4.14739e+05    2.97810e+02   -1.67546e+02
Pressure (bar)   Constr. rmsd
   2.67468e+00    1.03652e-06

DD  step 9 load imb.: force 19.9%

At step 10 the performance loss due to force load imbalance is 9.3 %

NOTE: Turning on dynamic load balancing

DD  step 999  vol min/aver 0.777  load imb.: force  0.1%

          Step           Time         Lambda
          1000        1.00000        0.00000

  Energies (kJ/mol)
          Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
   9.29054e+03    2.49530e+04    1.43296e+03    2.96188e+04    1.19777e+04
    Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
   1.40496e+05    3.99112e+04   -2.30308e+03   -5.96482e+05   -1.96429e+05
     Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
  -5.37533e+05    1.22974e+05   -4.14560e+05    2.98729e+02   -1.66560e+02
Pressure (bar)   Constr. rmsd
  -1.40877e+02    1.04647e-06

DD  step 1999  vol min/aver 0.773  load imb.: force  0.1%

................................................................................

          Step           Time         Lambda
         10000       10.00000        0.00000

Writing checkpoint, step 10000 at Fri Oct  8 14:58:26 2010


  Energies (kJ/mol)
          Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
   9.00658e+03    2.52059e+04    1.34920e+03    2.95995e+04    1.19606e+04
    Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
   1.40474e+05    4.00471e+04   -2.30290e+03   -5.96601e+05   -1.96374e+05
     Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
  -5.37636e+05    1.22577e+05   -4.15059e+05    2.97765e+02   -1.66533e+02
Pressure (bar)   Constr. rmsd
  -5.69272e+01    1.04191e-06

<======  ###############  ==>
<====  A V E R A G E S  ====>
<==  ###############  ======>

Statistics over 10001 steps using 1001 frames

  Energies (kJ/mol)
          Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
   9.11274e+03    2.49545e+04    1.36688e+03    2.96269e+04    1.20386e+04
    Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
   1.40680e+05    3.95513e+04   -2.30457e+03   -5.95701e+05   -1.96403e+05
     Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
  -5.37077e+05    1.22248e+05   -4.14829e+05    2.96967e+02   -1.66776e+02
Pressure (bar)   Constr. rmsd
   4.32167e+00    0.00000e+00

         Box-X          Box-Y          Box-Z
   6.01417e+00    7.09874e+00    1.07493e+01

  Total Virial (kJ/mol)
   4.10242e+04   -9.05005e+00   -2.30129e+02
  -1.20791e+01    4.05914e+04    1.71615e+02
  -2.13770e+02    1.99254e+02    4.04540e+04

  Pressure (bar)
  -1.22617e+01   -8.98547e-01    1.93020e+01
  -6.78561e-01    2.15880e+01   -8.68094e+00
   1.81194e+01   -1.06823e+01    3.63870e+00

  Total Dipole (D)
   4.73145e+02   -1.30311e+03   -2.15240e+02

 Epot (kJ/mol)        Coul-SR          LJ-SR        Coul-14          LJ-14
glu242side-glu242side 2.99268e+00 0.00000e+00 -1.85865e+02 1.35027e+00
glu242side-rest   -5.15085e+01   -2.83484e+01    2.08195e+01    4.24614e+00
     rest-rest   -5.95653e+05    3.95797e+04    1.40846e+05    1.20330e+04


M E G A - F L O P S   A C C O U N T I N G

  RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
  T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
  NF=No Forces

Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
Coul(T)                              10781.556318      452825.365     4.6
Coul(T) [W3]                            70.655819        8831.977     0.1
Coul(T) + LJ                         34247.547832     1883615.131    18.9
Coul(T) + LJ [W3]                     4684.616330      646477.054     6.5
Coul(T) + LJ [W3-W3]                 12244.355656     4677343.861    47.0
Outer nonbonded loop                  2334.588434       23345.884     0.2
1,4 nonbonded interactions             314.111408       28270.027     0.3
Calc Weights                          1395.229509       50228.262     0.5
Spread Q Bspline                    100456.524648      200913.049     2.0
Gather F Bspline                    100456.524648      602739.148     6.1
3D-FFT                              105882.547196      847060.378     8.5
Solve PME                             1490.549040       95395.139     1.0
NS-Pairs                             12131.757207      254766.901     2.6
Reset In Box                            23.514491          70.543     0.0
CG-CoM                                  46.596006         139.788     0.0
Bonds                                   61.886188        3651.285     0.0
Angles                                 219.991997       36958.655     0.4
Propers                                 23.872387        5466.777     0.1
RB-Dihedrals                           253.765374       62680.047     0.6
Virial                                  46.729683         841.134     0.0
Stop-CM                                  0.465030           4.650     0.0
P-Coupling                             465.076503        2790.459     0.0
Calc-Ekin                              465.123006       12558.321     0.1
Lincs                                   62.989618        3779.377     0.0
Lincs-Mat                              569.895960        2279.584     0.0
Constraint-V                           471.185790        3769.486     0.0
Constraint-Vir                          40.852892         980.469     0.0
Settle                                 115.084515       37172.298     0.4
-----------------------------------------------------------------------------
Total                                                 9944955.052   100.0
-----------------------------------------------------------------------------


   D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S

av. #atoms communicated per step for force:  2 x 31556.3
av. #atoms communicated per step for LINCS:  5 x 512.1

Average load imbalance: 0.5 %
Part of the total run time spent waiting due to load imbalance: 0.3 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: Y 0 %


    R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

Computing:         Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
Domain decomp.         4       1001       56.954       21.4     0.8
DD comm. load          4       1000        0.206        0.1     0.0
DD comm. bounds        4       1000        2.836        1.1     0.0
Comm. coord.           4      10001       30.480       11.5     0.4
Neighbor search        4       1001      579.978      218.1     7.8
Force                  4      10001     4548.315     1710.0    61.5
Wait + Comm. F         4      10001       17.520        6.6     0.2
PME mesh               4      10001     1897.783      713.5    25.7
Write traj.            4          3        0.668        0.3     0.0
Update                 4      10001       45.142       17.0     0.6
Constraints            4      10001      181.826       68.4     2.5
Comm. energies         4       1011        3.026        1.1     0.0
Rest                   4                  31.895       12.0     0.4
-----------------------------------------------------------------------
Total                  4                7396.630     2780.9   100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
PME redist. X/F        4      20002      208.454       78.4     2.8
PME spread/gather      4      20002     1440.827      541.7    19.5
PME 3D-FFT             4      20002      203.508       76.5     2.8
PME solve              4      10001       44.697       16.8     0.6
-----------------------------------------------------------------------

Parallel run - timing based on wallclock.

              NODE (s)   Real (s)      (%)
      Time:    695.218    695.218    100.0
                      11:35
              (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    243.800     14.305      1.243     19.310
Finished mdrun on node 0 Fri Oct  8 14:58:27 2010


////////////////////////////////////////////////////////////////////////////////////////////////////
                        :-)  G  R  O  M  A  C  S  (-:

                  Groningen Machine for Chemical Simulation

                  :-)  VERSION 4.5.1-dev-20101006-d3b58  (-:

       Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
     Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
       Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff,
         Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schultz,
               Michael Shirts, Alfons Sijbers, Peter Tieleman,

              Berk Hess, David van der Spoel, and Erik Lindahl.

      Copyright (c) 1991-2000, University of Groningen, The Netherlands.
           Copyright (c) 2001-2010, The GROMACS development team at
       Uppsala University & The Royal Institute of Technology, Sweden.
           check out http://www.gromacs.org for more information.

        This program is free software; you can redistribute it and/or
         modify it under the terms of the GNU General Public License
        as published by the Free Software Foundation; either version 2
            of the License, or (at your option) any later version.

:-) /home/leontyev/programs/bin/gromacs/gromacs-4.5.1-gpu/bin/mdrun-gpu (-:

Input Parameters:
  integrator           = md
  nsteps               = 10000
  init_step            = 0
  ns_type              = Grid
  nstlist              = 10
  ndelta               = 2
  nstcomm              = 1003
  comm_mode            = Linear
  nstlog               = 1000
  nstxout              = 5000
  nstvout              = 10000000
  nstfout              = 0
  nstcalcenergy        = 10
  nstenergy            = 1000
  nstxtcout            = 0
  init_t               = 0
  delta_t              = 0.001
  xtcprec              = 1000
  nkx                  = 54
  nky                  = 60
  nkz                  = 90
  pme_order            = 6
  ewald_rtol           = 1e-05
  ewald_geometry       = 0
  epsilon_surface      = 0
  optimize_fft         = TRUE
  ePBC                 = xyz
  bPeriodicMols        = FALSE
  bContinuation        = FALSE
  bShakeSOR            = FALSE
  etc                  = Andersen
  nsttcouple           = 10
  epc                  = Berendsen
  epctype              = Isotropic
  nstpcouple           = 10
  tau_p                = 0.5
  ref_p (3x3):
     ref_p[    0]={ 1.01325e+00,  0.00000e+00,  0.00000e+00}
     ref_p[    1]={ 0.00000e+00,  1.01325e+00,  0.00000e+00}
     ref_p[    2]={ 0.00000e+00,  0.00000e+00,  1.01325e+00}
  compress (3x3):
     compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
     compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
     compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
  refcoord_scaling     = No
  posres_com (3):
     posres_com[0]= 0.00000e+00
     posres_com[1]= 0.00000e+00
     posres_com[2]= 0.00000e+00
  posres_comB (3):
     posres_comB[0]= 0.00000e+00
     posres_comB[1]= 0.00000e+00
     posres_comB[2]= 0.00000e+00
  andersen_seed        = 815131
  rlist                = 1.2
  rlistlong            = 1.2
  rtpi                 = 0.05
  coulombtype          = PME
  rcoulomb_switch      = 0
  rcoulomb             = 1.2
  vdwtype              = Cut-off
  rvdw_switch          = 0
  rvdw                 = 1.2
  epsilon_r            = 1
  epsilon_rf           = 1
  tabext               = 1
  implicit_solvent     = No
  gb_algorithm         = Still
  gb_epsilon_solvent   = 80
  nstgbradii           = 1
  rgbradii             = 1
  gb_saltconc          = 0
  gb_obc_alpha         = 1
  gb_obc_beta          = 0.8
  gb_obc_gamma         = 4.85
  gb_dielectric_offset = 0.009
  sa_algorithm         = Ace-approximation
  sa_surface_tension   = 2.092
  DispCorr             = EnerPres
  free_energy          = no
  init_lambda          = 0
  delta_lambda         = 0
  n_foreign_lambda     = 0
  sc_alpha             = 0
  sc_power             = 0
  sc_sigma             = 0.3
  sc_sigma_min         = 0.3
  nstdhdl              = 10
  separate_dhdl_file   = yes
  dhdl_derivatives     = yes
  dh_hist_size         = 0
  dh_hist_spacing      = 0.1
  nwall                = 0
  wall_type            = 9-3
  wall_atomtype[0]     = -1
  wall_atomtype[1]     = -1
  wall_density[0]      = 0
  wall_density[1]      = 0
  wall_ewald_zfac      = 3
  pull                 = no
  disre                = No
  disre_weighting      = Conservative
  disre_mixed          = FALSE
  dr_fc                = 1000
  dr_tau               = 0
  nstdisreout          = 100
  orires_fc            = 0
  orires_tau           = 0
  nstorireout          = 100
  dihre-fc             = 1000
  em_stepsize          = 0.01
  em_tol               = 10
  niter                = 20
  fc_stepsize          = 0
  nstcgsteep           = 1000
  nbfgscorr            = 10
  ConstAlg             = Lincs
  shake_tol            = 0.0001
  lincs_order          = 8
  lincs_warnangle      = 30
  lincs_iter           = 4
  bd_fric              = 0
  ld_seed              = 1993
  cos_accel            = 0
  deform (3x3):
     deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
     deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
     deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
  userint1             = 0
  userint2             = 0
  userint3             = 0
  userint4             = 0
  userreal1            = 0
  userreal2            = 0
  userreal3            = 0
  userreal4            = 0
grpopts:
  nrdf:       99021
  ref_t:      298.15
  tau_t:         0.3
anneal:          No
ann_npoints:           0
  acc:            0           0           0
nfreeze: Y Y Y N N N
  energygrp_flags[  0]: 0 0
  energygrp_flags[  1]: 0 0
  efield-x:
     n = 0
  efield-xt:
     n = 0
  efield-y:
     n = 0
  efield-yt:
     n = 0
  efield-z:
     n = 0
  efield-zt:
     n = 0
  bQMMM                = FALSE
  QMconstraints        = 0
  QMMMscheme           = 0
  scalefactor          = 1
qm_opts:
  ngQM                 = 0
Table routines are used for coulomb: TRUE
Table routines are used for vdw:     FALSE
Will do PME sum in reciprocal space.

Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
Cut-off's:   NS: 1.2   Coulomb: 1.2   LJ: 1.2
Long Range LJ corr.: <C6> 4.0351e-04
System total charge: -0.000
Generated table with 1100 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1100 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Enabling SPC-like water optimization for 11505 molecules.

Configuring nonbonded kernels...
Configuring standard C nonbonded kernels...


Removing pbc first time

Initializing LINear Constraint Solver

Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
 0:  rest
Max number of connections per atom is 91
Total number of connections is 387700
Max number of graph edges per atom is 6
Total number of graph edges is 70330

OpenMM plugins loaded from directory /home/leontyev/programs/bin/gromacs/OpenMM2.0-Linux64/lib/plugins: libOpenMMCuda.so, libOpenMMOpenCL.so,
The combination rule of the used force field matches the one used by OpenMM.
Gromacs will use the OpenMM platform: Cuda
Gromacs will run on the GPU #0 (GeForce GTX 260).
Pre-simulation ~15s memtest in progress...
Memory test completed without errors.

Constraining the starting coordinates (step 0)

Constraining the coordinates at t0-dt (step 0)
Initial temperature: 0 K

Started mdrun on node 0 Fri Oct  8 16:54:04 2010

          Step           Time         Lambda
             0        0.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.34934e+05    1.22629e+05   -4.12305e+05    2.97883e+02    1.03777e-06

          Step           Time         Lambda
          1000        1.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.42963e+05    1.16609e+05   -4.26354e+05    2.83260e+02    1.03777e-06

          Step           Time         Lambda
          2000        2.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.49782e+05    1.14408e+05   -4.35374e+05    2.77912e+02    1.03777e-06

          Step           Time         Lambda
          3000        3.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.51337e+05    1.12705e+05   -4.38631e+05    2.73777e+02    1.03777e-06

          Step           Time         Lambda
          4000        4.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.52340e+05    1.12827e+05   -4.39513e+05    2.74073e+02    1.03777e-06

          Step           Time         Lambda
          5000        5.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.52599e+05    1.13543e+05   -4.39056e+05    2.75812e+02    1.03777e-06

          Step           Time         Lambda
          6000        6.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.52946e+05    1.14271e+05   -4.38675e+05    2.77580e+02    1.03777e-06

          Step           Time         Lambda
          7000        7.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.51992e+05    1.13521e+05   -4.38471e+05    2.75759e+02    1.03777e-06

          Step           Time         Lambda
          8000        8.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.52834e+05    1.14111e+05   -4.38723e+05    2.77192e+02    1.03777e-06

          Step           Time         Lambda
          9000        9.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.52806e+05    1.13783e+05   -4.39022e+05    2.76396e+02    1.03777e-06

          Step           Time         Lambda
         10000       10.00000        0.00000

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.53230e+05    1.12594e+05   -4.40636e+05    2.73506e+02    1.03777e-06

Writing checkpoint, step 10000 at Fri Oct  8 17:06:02 2010


<======  ###############  ==>
<====  A V E R A G E S  ====>
<==  ###############  ======>

Statistics over 11 steps using 11 frames

  Energies (kJ/mol)
     Potential    Kinetic En.   Total Energy    Temperature   Constr. rmsd
  -5.49797e+05    1.14636e+05   -4.35160e+05    2.78468e+02    0.00000e+00

         Box-X          Box-Y          Box-Z
   1.73572e+12    1.19301e-40    2.31720e+11

  Total Virial (kJ/mol)
   0.00000e+00    0.00000e+00    0.00000e+00
   0.00000e+00    0.00000e+00    0.00000e+00
   0.00000e+00    0.00000e+00    0.00000e+00

  Pressure (bar)
   0.00000e+00    0.00000e+00    0.00000e+00
   0.00000e+00    0.00000e+00    0.00000e+00
   0.00000e+00    0.00000e+00    0.00000e+00

  Total Dipole (D)
   0.00000e+00    0.00000e+00    0.00000e+00

 Epot (kJ/mol)        Coul-SR          LJ-SR        Coul-14          LJ-14
glu242side-glu242side 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00
glu242side-rest    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00
     rest-rest    0.00000e+00    0.00000e+00    0.00000e+00    0.00000e+00

Post-simulation ~15s memtest in progress...
Memory test completed without errors.

M E G A - F L O P S   A C C O U N T I N G

  RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
  T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
  NF=No Forces

Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
Lincs                                    0.011934           0.716     8.0
Lincs-Mat                                0.106200           0.425     4.7
Constraint-V                             0.046449           0.372     4.2
Settle                                   0.023010           7.432    83.1
-----------------------------------------------------------------------------
Total                                                       8.945   100.0
-----------------------------------------------------------------------------


    R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

Computing:         Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
Write traj.            1         11        3.033        1.1     0.2
Rest                   1                1978.521      716.9    99.8
-----------------------------------------------------------------------
Total                  1                1981.554      718.0   100.0
-----------------------------------------------------------------------

OpenMM run - timing based on wallclock.

              NODE (s)   Real (s)      (%)
      Time:    717.970    717.970    100.0
                      11:57
              (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      0.000      0.012      1.204     19.942
Finished mdrun on node 0 Fri Oct  8 17:06:02 2010
////////////////////////////////////////////////////////////////////////////////////////////////////

Igor Leontyev wrote:

Finally, I compiled and ran simulations with the GPU version of gromacs-4.5.1.
There were several issues:

1) Precompiled OpenMM2.0 libraries and headers must be downloaded (which
requires registration on their web page) and installed, otherwise cmake
doesn't find some source files.

2) cmake should be called outside the original source directory with the
path of the directory as an argument.

3) To run the resulting mdrun-gpu binary, the CUDA dev driver must be
installed; otherwise the program does not find 'CUDA'. This step turned out
to be the most problematic for me. According to the OpenMM manual, the driver
must be installed with the X Window service turned off, which can be done with
the command "init 3". In Ubuntu this command has no effect; there, the
graphical interface is switched off/on with

"sudo service gdm stop/start"

It turned out that in Ubuntu-10.04 the CUDA driver installation script does
not work properly even with gdm turned off. This issue and its solution are
described at http://ubuntuforums.org/showthread.php?t=1467074

Thank you for comments,

Igor
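
Point 2 above (running cmake outside the source directory) might look like the following minimal sketch. The function name configure_gpu_build, the example paths, and the CMAKE override variable are illustrative assumptions, not part of GROMACS; only the two -D options are taken from the cmake command quoted further down in this thread.

```shell
# Configure a GPU (OpenMM) build of GROMACS in a separate build directory.
#   $1 = absolute path of the top-level source directory
#   $2 = build directory, created if it does not exist
# CMAKE is a hypothetical override for the cmake executable.
configure_gpu_build() {
    src_dir=$1
    build_dir=$2
    mkdir -p "$build_dir" || return 1
    # Run cmake from inside the build directory, passing the source path.
    ( cd "$build_dir" && "${CMAKE:-cmake}" "$src_dir" -DGMX_OPENMM=ON -DGMX_THREADS=OFF )
}

# Example call (placeholder paths):
# configure_gpu_build "$HOME/src/gromacs-4.5.1" "$HOME/build/gromacs-4.5.1-gpu"
```

Keeping the build directory separate means a failed configuration can be discarded by deleting that directory without touching the sources.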


Szilárd Páll wrote:
Dear Igor,

Your output looks _very_ weird; it seems as if CMake internal
variable(s) were not initialized, and I have no clue how that could
have happened. The build generator works just fine for me. The only
thing I can think of is that maybe your CMakeCache is corrupted.

Could you please rerun cmake in a _clean_ build directory? Also, are
you able to run cmake for a CPU build (no -D options)?

--
Szilárd
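
A clean build directory, as suggested above, can be obtained either by creating a fresh empty directory or by removing the cached CMake state from an existing one. A minimal sketch of the latter follows; the function name clean_cmake_state is a hypothetical helper, while CMakeCache.txt and CMakeFiles/ are the cache file and directory that CMake writes into a build directory.

```shell
# Remove cached CMake state so the next cmake run starts from scratch.
#   $1 = build directory
clean_cmake_state() {
    build_dir=$1
    [ -d "$build_dir" ] || { echo "no such directory: $build_dir" >&2; return 1; }
    # Only the cache file and CMake's bookkeeping directory are removed;
    # everything else in the directory is left alone.
    rm -rf "$build_dir/CMakeCache.txt" "$build_dir/CMakeFiles"
}
```

If cmake was ever run inside the source tree itself, the same two items would need to be removed there as well.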

Szilárd wrote:

The beta versions are all outdated, could you please use the latest
source distribution (4.5.1) instead (or git from the
release-4-5-patches branch)?

The result is the same for both the 4.5.1 distribution and git from the
release-4-5-patches branch. See the output below.
=========================================

PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
LD_LIBRARY_PATH=/usr/local/opt/bin/mpi/openmpi-1.4.2/lib:/home/leontyev/programs/bin/cuda/lib64:
CPPFLAGS=-I//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/include
-I//usr/local/opt/bin/mpi/openmpi-1.4.2/include
LDFLAGS=-L//usr/local/opt/bin/gromacs/fftw-3.2.2/single_sse/lib
-L//usr/local/opt/bin/mpi/openmpi-1.4.2/lib
OPENMM_ROOT_DIR=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git/openmm

cmake src -DGMX_OPENMM=ON -DGMX_THREADS=OFF -DCMAKE_INSTALL_PREFIX=/home/leontyev/programs/bin/gromacs/gromacs-4.5.1-git
CMake Error at gmxlib/CMakeLists.txt:124 (set_target_properties):
set_target_properties called with incorrect number of arguments.


CMake Error at gmxlib/CMakeLists.txt:126 (install):
install TARGETS given no ARCHIVE DESTINATION for static library target
"gmx".


CMake Error at mdlib/CMakeLists.txt:11 (set_target_properties):
set_target_properties called with incorrect number of arguments.


CMake Error at mdlib/CMakeLists.txt:13 (install):
install TARGETS given no ARCHIVE DESTINATION for static library target
"md".


CMake Error at kernel/CMakeLists.txt:43 (set_target_properties):
set_target_properties called with incorrect number of arguments.


CMake Error at kernel/CMakeLists.txt:44 (set_target_properties):
set_target_properties called with incorrect number of arguments.


CMake Error at kernel/gmx_gpu_utils/CMakeLists.txt:18 (CUDA_INCLUDE_DIRECTORIES):
Unknown CMake command "CUDA_INCLUDE_DIRECTORIES".


CMake Warning (dev) in CMakeLists.txt:
No cmake_minimum_required command is present. A line of code such as

cmake_minimum_required(VERSION 2.8)

should be added at the top of the file. The version specified may be lower
if you wish to support older CMake versions for this project. For more
information run "cmake --help-policy CMP0000".
This warning is for project developers. Use -Wno-dev to suppress it.

-- Configuring incomplete, errors occurred!
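
One possible reading of the output above: the command passes src to cmake rather than the top-level source directory, and the warning about a missing cmake_minimum_required is what CMake prints when the CMakeLists.txt it starts from lacks that call, as a subdirectory's file typically does. This is an interpretation, not something confirmed in the thread. A small heuristic check that could be run before configuring is sketched below; the function name is_top_level_cmake_dir is an assumption.

```shell
# Succeed if the directory's CMakeLists.txt mentions cmake_minimum_required,
# which a project's top-level file normally does and a subdirectory's does not.
#   $1 = directory about to be passed to cmake
is_top_level_cmake_dir() {
    [ -f "$1/CMakeLists.txt" ] &&
        grep -qi 'cmake_minimum_required' "$1/CMakeLists.txt"
}

# Example use:
# is_top_level_cmake_dir src || echo "src is probably not the top-level directory"
```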


--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
