Hello,
This is the output from the deviceQuery command:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
There are 4 devices supporting CUDA
Device 0: "Tesla T10 Processor"
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.44 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Device 1: "Tesla T10 Processor"
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.44 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Device 2: "Tesla T10 Processor"
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.44 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Device 3: "Tesla T10 Processor"
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.44 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4243455, CUDA Runtime Version = 3.20, NumDevs = 4, Device = Tesla T10 Processor, Device = Tesla T10 Processor
PASSED
----------------------------------------------------------------------------
This simulation had already been run on the CPU first, and I then tried to run the second one on the GPU (rough command lines below).
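Roughly, the two runs were started like this; the CPU line is from memory, so treat it as approximate, while the GPU line is the exact command quoted further down in this thread:

  # CPU run (assumed plain mdrun invocation with the same -deffnm)
  mdrun -deffnm md_0_2

  # GPU run (the command actually used, quoted below)
  mdrun-gpu -deffnm md_0_2 -device "OpenMM:platform=Cuda,deviceid=1,force-device=yes"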
Thanks.
Bongkeun Kim
Quoting Szilard Pall <szilard.p...@cbr.su.se>:
Hi,
Tesla C1060 and S1070 are definitely supported, so it's strange that you get that warning. The only thing I can think of is that for some reason the CUDA runtime reports the GPU name as something other than C1060/S1070. Could you please run deviceQuery from the SDK (a quick build/run sketch below) and post the output here?
However, that should not be causing the NaN issue. Does the same
simulation run on the CPU?
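If it helps, deviceQuery can usually be built and run from the GPU Computing SDK roughly like this; the SDK path below is just an assumed default install location:

  cd ~/NVIDIA_GPU_Computing_SDK/C
  make -C src/deviceQuery          # builds into bin/linux/release/ by default
  ./bin/linux/release/deviceQuery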
Cheers,
--
Szilard
2010/12/15 Bongkeun Kim <b...@chem.ucsb.edu>:
Hello,
I tried using a 1 fs timestep and it did not work.
I'm using NVIDIA T10 GPUs (C1060 or S1070), and mdrun-gpu said it is not a supported GPU, so I had to use "force-device=yes". Do you think this could be the reason for the error?
Thanks.
Bongkeun Kim
Quoting Emanuel Peter <emanuel.pe...@chemie.uni-regensburg.de>:
Hello,
If you use a timestep of 1 fs instead of 2 fs, it might run better (see the mdp sketch below).
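Only two mdp lines would need to change. A minimal sketch, assuming you want to keep the same total simulated time (the nsteps value below is just that assumption worked out: 100000000 * 1 fs = 100 ns):

  dt      = 0.001       ; 1 fs timestep instead of 2 fs
  nsteps  = 100000000   ; doubled so 1 fs * 100000000 steps = 100 ns as before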
Bests,
Emanuel
Bongkeun Kim 15.12.10 8:36 >>>
Hello,
I got an error log when I used gromacs-gpu on npt simulation.
The error is like:
---------------------------------------------------------------
Input Parameters:
integrator = md
nsteps = 50000000
init_step = 0
ns_type = Grid
nstlist = 5
ndelta = 2
nstcomm = 10
comm_mode = Linear
nstlog = 1000
nstxout = 1000
nstvout = 1000
nstfout = 0
nstcalcenergy = 5
nstenergy = 1000
nstxtcout = 1000
init_t = 0
delta_t = 0.002
xtcprec = 1000
nkx = 32
nky = 32
nkz = 32
pme_order = 4
ewald_rtol = 1e-05
ewald_geometry = 0
epsilon_surface = 0
optimize_fft = FALSE
ePBC = xyz
bPeriodicMols = FALSE
bContinuation = TRUE
bShakeSOR = FALSE
etc = V-rescale
nsttcouple = 5
epc = Parrinello-Rahman
epctype = Isotropic
nstpcouple = 5
tau_p = 2
ref_p (3x3):
ref_p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}
ref_p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}
ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}
compress (3x3):
compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
compress[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
refcoord_scaling = No
posres_com (3):
posres_com[0]= 0.00000e+00
posres_com[1]= 0.00000e+00
posres_com[2]= 0.00000e+00
posres_comB (3):
posres_comB[0]= 0.00000e+00
posres_comB[1]= 0.00000e+00
posres_comB[2]= 0.00000e+00
andersen_seed = 815131
rlist = 1
rlistlong = 1
rtpi = 0.05
coulombtype = PME
rcoulomb_switch = 0
rcoulomb = 1
vdwtype = Cut-off
rvdw_switch = 0
rvdw = 1
epsilon_r = 1
epsilon_rf = 1
tabext = 1
implicit_solvent = No
gb_algorithm = Still
gb_epsilon_solvent = 80
nstgbradii = 1
rgbradii = 1
gb_saltconc = 0
gb_obc_alpha = 1
gb_obc_beta = 0.8
gb_obc_gamma = 4.85
gb_dielectric_offset = 0.009
sa_algorithm = Ace-approximation
sa_surface_tension = 2.05016
DispCorr = EnerPres
free_energy = no
init_lambda = 0
delta_lambda = 0
n_foreign_lambda = 0
sc_alpha = 0
sc_power = 0
sc_sigma = 0.3
sc_sigma_min = 0.3
nstdhdl = 10
separate_dhdl_file = yes
dhdl_derivatives = yes
dh_hist_size = 0
dh_hist_spacing = 0.1
nwall = 0
wall_type = 9-3
wall_atomtype[0] = -1
wall_atomtype[1] = -1
wall_density[0] = 0
wall_density[1] = 0
wall_ewald_zfac = 3
pull = no
disre = No
disre_weighting = Conservative
disre_mixed = FALSE
dr_fc = 1000
dr_tau = 0
nstdisreout = 100
orires_fc = 0
orires_tau = 0
nstorireout = 100
dihre-fc = 1000
em_stepsize = 0.01
em_tol = 10
niter = 20
fc_stepsize = 0
nstcgsteep = 1000
nbfgscorr = 10
ConstAlg = Lincs
shake_tol = 0.0001
lincs_order = 4
lincs_warnangle = 30
lincs_iter = 1
bd_fric = 0
ld_seed = 1993
cos_accel = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
nrdf: 24715
ref_t: 325
tau_t: 0.1
anneal: No
ann_npoints: 0
acc: 0 0 0
nfreeze: N N N
energygrp_flags[ 0]: 0
efield-x:
n = 0
efield-xt:
n = 0
efield-y:
n = 0
efield-yt:
n = 0
efield-z:
n = 0
efield-zt:
n = 0
bQMMM = FALSE
QMconstraints = 0
QMMMscheme = 0
scalefactor = 1
qm_opts:
ngQM = 0
Table routines are used for coulomb: TRUE
Table routines are used for vdw: FALSE
Will do PME sum in reciprocal space.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Will do ordinary reciprocal space Ewald sum.
Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
Cut-off's: NS: 1 Coulomb: 1 LJ: 1
Long Range LJ corr.: 2.9723e-04
System total charge: 0.000
Generated table with 1000 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1000 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1000 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1000 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1000 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1000 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Enabling SPC-like water optimization for 3910 molecules.
Configuring nonbonded kernels...
Configuring standard C nonbonded kernels...
Initializing LINear Constraint Solver
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
LINCS: A Linear Constraint Solver for molecular simulations
J. Comp. Chem. 18 (1997) pp. 1463-1472
-------- -------- --- Thank You --- -------- --------
The number of constraints is 626
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
G. Bussi, D. Donadio and M. Parrinello
Canonical sampling through velocity rescaling
J. Chem. Phys. 126 (2007) pp. 014101
-------- -------- --- Thank You --- -------- --------
Max number of connections per atom is 103
Total number of connections is 37894
Max number of graph edges per atom is 4
Total number of graph edges is 16892
OpenMM plugins loaded from directory /home/bkim/packages/openmm/lib/plugins:
libOpenMMCuda.so, libOpenMMOpenCL.so
The combination rule of the used force field matches the one used by OpenMM.
Gromacs will use the OpenMM platform: Cuda
Non-supported GPU selected (#1, Tesla T10 Processor), forced continuing. Note that the simulation can be slow or it might even crash.
Pre-simulation ~15s memtest in progress...
Memory test completed without errors.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
Entry Friedrichs2009 not found in citation database
-------- -------- --- Thank You --- -------- --------
Initial temperature: 0 K
Started mdrun on node 0 Tue Dec 14 23:10:20 2010
Step Time Lambda
0 0.00000 0.00000
Energies (kJ/mol)
    Potential  Kinetic En.  Total Energy  Temperature  Constr. rmsd
 -1.40587e+05  3.36048e+04  -1.06982e+05  3.27065e+02   0.00000e+00
Step Time Lambda
1000 2.00000 0.00000
Energies (kJ/mol)
    Potential  Kinetic En.  Total Energy  Temperature  Constr. rmsd
          nan          nan           nan          nan   0.00000e+00
Received the second INT/TERM signal, stopping at the next step
Step Time Lambda
1927 3.85400 0.00000
Energies (kJ/mol)
    Potential  Kinetic En.  Total Energy  Temperature  Constr. rmsd
          nan          nan           nan          nan   0.00000e+00
Writing checkpoint, step 1927 at Tue Dec 14 23:12:07 2010
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Statistics over 3 steps using 3 frames
Energies (kJ/mol)
    Potential  Kinetic En.  Total Energy  Temperature  Constr. rmsd
          nan          nan           nan          nan   0.00000e+00
Box-X Box-Y Box-Z
3.91363e-24 6.72623e-44 -1.71925e+16
Total Virial (kJ/mol)
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
Pressure (bar)
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
0.00000e+00 0.00000e+00 0.00000e+00
Total Dipole (D)
0.00000e+00 0.00000e+00 0.00000e+00
------------------------------------------------------------------------
The input mdp file is given by
========================================================
title = OPLS Lysozyme MD
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 50000000 ; 2 fs * 50000000 steps = 100000 ps (100 ns)
dt = 0.002 ; 2 fs
; Output control
nstxout = 1000 ; save coordinates every 2 ps
nstvout = 1000 ; save velocities every 2 ps
nstxtcout = 1000 ; xtc compressed trajectory output every 2 ps
nstenergy = 1000 ; save energies every 2 ps
nstlog = 1000 ; update log file every 2 ps
; Bond parameters
continuation = yes ; Restarting after NPT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
ns_type = grid ; search neighboring grid cells
nstlist = 5 ; 10 fs
rlist = 1.0 ; short-range neighborlist cutoff (in nm)
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = System ; single coupling group (whole system)
tau_t = 0.1 ; time constant, in ps
ref_t = 325 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
=========================================================================
It worked with the generic CPU mdrun but gave this error when mdrun-gpu was used with:
mdrun-gpu -deffnm md_0_2 -device "OpenMM:platform=Cuda,deviceid=1,force-device=yes"
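For reference, the same command pointed at a different card would look like this (just a sketch reusing the option names above; I don't know whether picking a different device index changes anything):

  mdrun-gpu -deffnm md_0_2 -device "OpenMM:platform=Cuda,deviceid=0,force-device=yes"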
If you have any idea how to avoid this problem, I would really appreciate it.
Thank you.
Bongkeun Kim
--
gmx-users mailing list gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the
www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists