As Mark said, please share the *entire* log file. Among other
important things, the result of PP-PME tuning is not included above.
However, I suspect that in this case scaling is strongly affected by
the small size of the system you are simulating.
--
Szilárd
On Sun, Nov 10, 2013 at 5:28 AM,
On Thu, Nov 7, 2013 at 6:34 AM, James Starlight wrote:
> I've come to the conclusion that simulations with 1 or 2 GPUs simultaneously give
> me the same performance
> mdrun -ntmpi 2 -ntomp 6 -gpu_id 01 -v -deffnm md_CaM_test,
>
> mdrun -ntmpi 2 -ntomp 6 -gpu_id 0 -v -deffnm md_CaM_test,
>
> Does it b
Let's not hijack James' thread as your hardware is different from his.
On Tue, Nov 5, 2013 at 11:00 PM, Dwey Kauffman wrote:
> Hi Szilard,
>
> Thanks for your suggestions. I am indeed aware of this page. In an 8-core
> AMD with 1GPU, I am very happy about its performance. See below. My
Actual
On Tue, Nov 5, 2013 at 9:55 PM, Dwey Kauffman wrote:
> Hi Timo,
>
> Can you provide a benchmark with "1" Xeon E5-2680 with "1" Nvidia
> K20X GPGPU on the same test of 29420 atoms?
>
> Are these two GPU cards (within the same node) connected by a SLI (Scalable
> Link Interface) ?
Note that
> threads, hence a total of 24 threads however even with hyper threading
>>> > enabled there are only 12 threads on your machine. Therefore, only
>>> allocate
>>> > 12. Try
>>> >
>>> > mdrun -ntmpi 2 -ntomp 6 -gpu_id 01 -v -deffnm md_CaM_test
Timo,
Have you used the default settings, that is one rank/GPU? If that is
the case, you may want to try using multiple ranks per GPU, this can
often help when you have >4-6 cores/GPU. Separate PME ranks are not
switched on by default with GPUs; have you tried using any?
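For example (an illustrative sketch, assuming a single node with two
GPUs and plenty of cores; scale -np with the node count), two ranks
per GPU, or dedicated PME ranks, could be requested as:
mpirun -np 4 mdrun_mpi -gpu_id 0011
mpirun -np 4 mdrun_mpi -npme 2 -gpu_id 01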
Cheers,
--
Szilárd Páll
You can use the "-march=native" flag with gcc to optimize for the CPU
you are building on or e.g. -march=corei7-avx-i for Intel Ivy Bridge
CPUs.
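With a CMake-based GROMACS build, one way (a sketch, not the only one)
to pass such a flag is via the compiler flag variables at configure
time:
cmake .. -DCMAKE_C_FLAGS="-march=native" -DCMAKE_CXX_FLAGS="-march=native"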
--
Szilárd Páll
On Mon, Nov 4, 2013 at 12:37 PM, James Starlight wrote:
> Szilárd, thanks for suggestion!
>
> What kind of CPU op
hine configurations before buying. (Note
that I have never tried it myself, so I can't provide more details or
vouch for it in any way.)
Cheers,
--
Szilárd Páll
On Fri, Nov 1, 2013 at 3:08 AM, David Chalmers
wrote:
> Hi All,
>
> I am considering setting up a small cluster to run Gr
Brad,
These numbers seem rather low for a standard simulation setup! Did
you use a particularly long cut-off or short time-step?
Cheers,
--
Szilárd Páll
On Fri, Nov 1, 2013 at 6:30 PM, Brad Van Oosten wrote:
> I'm not sure on the prices of these systems any more, they are getting dated
That should be enough. You may want to use the -march (or equivalent)
compiler flag for CPU optimization.
Cheers,
--
Szilárd Páll
On Sun, Nov 3, 2013 at 10:01 AM, James Starlight wrote:
> Dear Gromacs Users!
>
> I'd like to compile the latest 4.6 Gromacs with native GPU support
Hi Carsten,
On Thu, Oct 24, 2013 at 4:52 PM, Carsten Kutzner wrote:
> On Oct 24, 2013, at 4:25 PM, Mark Abraham wrote:
>
>> Hi,
>>
>> No. mdrun reports the stride with which it moves over the logical cores
>> reported by the OS, setting the affinity of GROMACS threads to logical
>> cores, and wa
There are
a few analysis tools that support OpenMP and even with those I/O will
be a severe bottleneck if you were considering using the Phi-s for
analysis.
So for now, I would stick to using only the CPUs in the system.
Cheers,
--
Szilárd Páll
On Thu, Oct 10, 2013 at 12:58 PM, Arun Sharma
Hi,
Admittedly, both the documentation on these features and the
communication on the known issues with these aspects of GROMACS have
been lacking.
Here's a brief summary/explanation:
- GROMACS 4.5: implicit solvent simulations possible using mdrun-gpu
which is essentially mdrun + OpenMM, hence it
On Mon, Sep 16, 2013 at 7:04 PM, PaulC wrote:
> Hi,
>
>
> I'm attempting to build GROMACS 4.6.3 to run entirely within a single Xeon
> Phi (i.e. native) with either/both Intel MPI/OpenMP for parallelisation
> within the single Xeon Phi.
>
> I followed these instructions from Intel for cross compil
Looks like you are compiling 4.5.1. You should try compiling the
latest version in the 4.5 series, 4.5.7.
--
Szilárd
On Sun, Sep 15, 2013 at 6:39 PM, Muthukumaran R wrote:
> hello,
>
> I am trying to install gromacs in cygwin but after issuing "make",
> installation stops with the following erro
le to judge what is causing
the problem.
Cheers,
--
Szilárd
> Best regards,
> Guanglei
>
>
> On Mon, Sep 9, 2013 at 4:35 PM, Szilárd Páll wrote:
>
>> Hi,
>>
>> First of all, icc 11 is not well tested and there have been reports
>> about it compiling broken
FYI, I've filed a bug report which you can track if interested:
http://redmine.gromacs.org/issues/1334
--
Szilárd
On Sun, Sep 1, 2013 at 9:49 PM, Szilárd Páll wrote:
> I may have just come across this issue as well. I have no time to
> investigate, but my guess is that it's
Hi,
First of all, icc 11 is not well tested and there have been reports
about it compiling broken code. This could explain the crash, but
you'd need to do a bit more testing to confirm. Regarding the GPU
detection error, if you use a driver which is incompatible with the
CUDA runtime (at least as h
On Tue, Sep 3, 2013 at 9:50 PM, Guanglei Cui
wrote:
> Hi Mark,
>
> I agree with you and Justin, but let's just say there are things that are
> out of my control ;-) I just tried SSE2 and NONE. Both failed the
> regression check.
That's alarming; with GMX_CPU_ACCELERATION=None only the plain C
ker
On Thu, Aug 29, 2013 at 7:18 AM, Gianluca Interlandi
wrote:
> Justin,
>
> I respect your opinion on this. However, in the paper indicated below by BR
> Brooks they used a cutoff of 10 A on LJ when testing IPS in CHARMM:
>
> Title: Pressure-based long-range correction for Lennard-Jones interactions
I may have just come across this issue as well. I have no time to
investigate, but my guess is that it's related to some thread-safety
issue with thread-MPI.
Could one of you please file a bug report on redmine.gromacs.org?
Cheers,
--
Szilárd
On Thu, Aug 8, 2013 at 5:52 PM, Brad Van Oosten wro
That should never happen. If mdrun is compiled with GPU support and
GPUs are detected, the detection stats should always get printed.
Can you reliably reproduce the issue?
--
Szilárd
On Fri, Aug 2, 2013 at 9:50 AM, Jernej Zidar wrote:
> Hi there.
> Lately I've been running simulations using G
erties to the MB should I consider for such system ?
>>
>> James
>>
>>
>> 2013/5/28 lloyd riggs
>>
>>> Dear Dr. Pali,
>>>
>>> Thank you,
>>>
>>> Stephan Watkins
>>>
>>> *Sent:* Tuesday, 28 May 2013 at
Hi,
The Intel compilers are only recommended for pre-Bulldozer AMD
processors (K10: Magny-Cours, Istanbul, Barcelona, etc.). On these,
PME non-bonded kernels (not the RF or plain cut-off!) are 10-30%
slower with gcc than with icc. The icc-gcc difference is the smallest
with gcc 4.7, typically arou
Dear Ramon,
Thanks for the kind words!
On Tue, Jun 18, 2013 at 10:22 AM, Ramon Crehuet Simon
wrote:
> Dear Szilard,
> Thanks for your message. Your help is priceless and helps advance science
> more than many publications. I extend that to many experts who kindly and
> promptly answer question
On Fri, Jul 19, 2013 at 6:59 PM, gigo wrote:
> Hi!
>
>
> On 2013-07-17 21:08, Mark Abraham wrote:
>>
>> You tried ppn3 (with and without --loadbalance)?
>
>
> I was testing on 8-replicas simulation.
>
> 1) Without --loadbalance and -np 8.
> Excerpts from the script:
> #PBS -l nodes=8:ppn=3
> seten
On Thu, Jul 25, 2013 at 5:55 PM, Mark Abraham wrote:
> That combo is supposed to generate a CMake warning.
>
> I also get a warning during linking that some shared library will have
> to provide some function (getpwuid?) at run time, but the binary is
> static.
That warning has always popped up f
The message is perfectly normal. When you do not use all available
cores/hardware threads (seen as "CPUs" by the OS), to avoid potential
clashes, mdrun does not pin threads (i.e. it lets the OS migrate
threads). On NUMA systems (most multi-CPU machines), this will cause
performance degradation as w
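If you do want pinning in such a partial-node run, you can request it
explicitly (a sketch; the thread count and offset are illustrative),
and offset a second run sharing the node so the two do not overlap:
mdrun -ntomp 6 -pin on
mdrun -ntomp 6 -pin on -pinoffset 6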
Depending on the level of parallelization (number of nodes and number
of particles/core) you may want to try:
- 2 ranks/node: 8 cores + 1 GPU, no separate PME (default):
mpirun -np 2*Nnodes mdrun_mpi [-gpu_id 01 -npme 0]
- 4 ranks per node: 4 cores + 1 GPU (shared between two ranks), no separate PME:
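Following the pattern above (a sketch, assuming the same two GPUs per
node; the bracketed options are optional):
mpirun -np 4*Nnodes mdrun_mpi [-gpu_id 0011 -npme 0]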
FYI: The MKL FFT has been shown to be as much as 30% slower (sometimes more) than FFTW 3.3.
--
Szilárd
On Thu, Jul 11, 2013 at 1:17 AM, Éric Germaneau wrote:
> I have the same feeling too but I'm not in charge of it unfortunately.
> Thank you, I appreciate.
>
>
> On 07/11/2013 07:15 AM, Mark Abraham wrote:
>>
>> No
Just a note regarding the performance "issues" mentioned. You are
using reaction-field electrostatics, a case in which by default there is
very little force workload left for the CPU (only the bondeds) and
therefore the CPU idles most of the time. To improve performance, use
-nb gpu_cpu with multiple
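For example, something like the following (a sketch; the thread/rank
counts and file name are illustrative) splits the non-bonded work
between GPU and CPU so the CPU cores do not idle:
mdrun -ntmpi 2 -ntomp 6 -nb gpu_cpu -deffnm md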
Hi,
Is affinity setting (pinning) on? What compiler are you using? There
are some known issues with Intel OpenMP getting in the way of the
internal affinity setting. To verify whether this is causing a
problem, try turning off pinning (-pin off).
Cheers,
--
Szilárd
On Tue, Jul 9, 2013 at 5:29 PM
On Tue, Jul 9, 2013 at 11:20 AM, Albert wrote:
> On 07/09/2013 11:15 AM, Szilárd Páll wrote:
>>
>> Tesla C1060 is not compatible - which should be shown in the log and
>> standard output.
>>
>> Cheers,
>> --
>> Szilárd
>
>
> THX for kind comme
Tesla C1060 is not compatible - which should be shown in the log and
standard output.
Cheers,
--
Szilárd
On Tue, Jul 9, 2013 at 10:54 AM, Albert wrote:
> Dear:
>
> I've installed a gromacs-4.6.3 in a GPU cluster, and I obtained the
> following information for testing:
>
> NOTE: Using a GPU wit
PS: the error message is referring to the *driver* version, not the
CUDA toolkit/runtime version.
--
Szilárd
On Tue, Jul 9, 2013 at 11:15 AM, Szilárd Páll wrote:
> Tesla C1060 is not compatible - which should be shown in the log and
> standard output.
>
> Cheers,
> --
> Sz
On Mon, Jun 24, 2013 at 4:43 PM, Szilárd Páll wrote:
> On Sat, Jun 22, 2013 at 5:55 PM, Mirco Wahab
> wrote:
>> On 22.06.2013 17:31, Mare Libero wrote:
>>>
>>> I am assembling a GPU workstation to run MD simulations, and I was
>>> wondering if anyone has a
autocomplete).
>
> I am still trying to fix the issues with the intel compiler. The gcc
> compiled version benchmark at 52ns/day with the lysozyme in water tutorial.
icc 12 and 13 should just work with CUDA 5.0.
Cheers,
--
Szilárd
>
> Thanks again.
>
>
FYI: 4.6.2 contains a bug related to thread affinity setting which
will lead to a considerable performance loss (I've seen 35%) as well
as often inconsistent performance - especially with GPUs (case in
which one would run many OpenMP threads/rank). My advice is that you
either use the code from git
On Thu, Jun 27, 2013 at 12:57 PM, Mare Libero wrote:
> Hello everybody,
>
> Does anyone have any recommendation regarding the installation of gromacs 4.6
> on Ubuntu 12.04? I have the nvidia-cuda-toolkit that comes in synaptic
> (4.0.17-3ubuntu0.1 installed in /usr/lib/nvidia-cuda-toolkit) and t
Thanks Mirco, good info, your numbers look quite consistent. The only
complicating factor is that your CPUs are overclocked by different
amounts, which changes the relative performances somewhat compared to
non-overclocked parts.
However, let me list some prices to show that the top-of-the line AM
If you have a solid example that reproduces the problem, feel free to
file an issue on redmine.gromacs.org ASAP. Briefly documenting your
experiments and verification process on the issue report page can
help developers in giving you faster feedback as well as with
accepting the report as a bu
On Sat, Jun 22, 2013 at 5:55 PM, Mirco Wahab
wrote:
> On 22.06.2013 17:31, Mare Libero wrote:
>>
>> I am assembling a GPU workstation to run MD simulations, and I was
>> wondering if anyone has any recommendation regarding the GPU/CPU
>> combination.
>> From what I can see, the GTX690 could be th
I strongly suggest that you consider the single-chip GTX cards instead
of a dual-chip one; from the point of view of price/performance you'll
probably get the most from a 680 or 780.
You could ask why, so here are the reasons:
- The current parallelization scheme requires domain-decomposition to
u
Dear Ramon,
Compute capability does not reflect the performance of a card, but it
is an indicator of what functionalities the GPU provides - more
like a generation number or feature set version.
Quadro cards are typically quite close in performance/$ to Teslas with
roughly 5-8x *lower* "GROMA
-missing-field-initializers
> -Wno-sign-compare -Wall -Wno-unused -Wunused-value -fomit-frame-pointer
> -funroll-all-loops -fexcess-precision=fast -O3 -DNDEBUG
>
>
> All the regressiontests failed. So it appears that, at least for my system,
> I need to include the direc
Amil,
It looks like there is a mixup in your software configuration and
mdrun is linked against libguide.so, the OpenMP runtime that is part of the
Intel compiler v11, which gets loaded early and is probably causing the
crash. This library was probably pulled in implicitly by MKL which the
build system det
On Wed, Jun 5, 2013 at 4:35 PM, João Henriques
wrote:
> Just to wrap up this thread, it does work when the mpirun is properly
> configured. I knew it had to be my fault :)
>
> Something like this works like a charm:
> mpirun -npernode 2 mdrun_mpi -ntomp 8 -gpu_id 01 -deffnm md -v
That is indeed t
On Sat, Jun 8, 2013 at 9:21 PM, Albert wrote:
> Hello:
>
> Recently I found a strange question about Gromacs-4.6.2 on GPU workstation.
> In my GTX690 machine, when I run md production I found that the ECC is on.
> However, in my another GTX590 machine, I found the ECC was off:
>
> 4 GPUs detected:
Just a few minor details:
- You can set the affinities yourself through the job scheduler which
should give nearly identical results compared to the mdrun internal
affinity if you simply assign cores to mdrun threads in a sequential
order (or with an #physical cores stride if you want to use
Hyper
"-nt" is mostly a backward compatibility option and sets the total
number of threads (per rank). Instead, you should set both "-ntmpi"
(or -np with MPI) and "-ntomp". However, note that unless a single
mdrun uses *all* cores/hardware threads on a node, it won't pin the
threads to cores. Failing to
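As an illustration (assuming a 16-core node with two GPUs; adjust the
numbers to your machine), instead of -nt 16 you would specify the
decomposition explicitly:
mdrun -ntmpi 2 -ntomp 8 -gpu_id 01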
mdrun is not blind, it's just that the current design does not report the
hardware of all compute nodes used. Whatever CPU/GPU hardware mdrun reports in
the log/std output is *only* what rank 0, i.e. the first MPI process,
detects. If you have a heterogeneous hardware configuration, in most
cases you should be a
There's no ibverbs support, so pick your favorite/best MPI
implementation; beyond that there is not much you can do.
--
Szilárd
On Mon, Jun 3, 2013 at 2:54 PM, Bert wrote:
> Dear all,
>
> My cluster has a FDR (56 Gb/s) Infiniband network. It is well known that
> there is a big difference between using IPoIB
gner
> PhD Student, MBM Group
>
> Klaus Tschira Lab (KTL)
> Max Planck Partner Institut for Computational Biology (PICB)
> 320 YueYang Road
> 200031 Shanghai, China
>
> phone: +86-21-54920475
> email: johan...@picb.ac.cn
>
> and
>
> Heidelberg Institut for Theore
Thanks for reporting this.
The best would be a redmine bug with a tpr, the command line invocation for
reproduction, as well as the log output, so we can see what software and
hardware configuration you are using.
Cheers,
--
Szilárd
On Mon, Jun 3, 2013 at 2:46 PM, Johannes Wagner
wrote:
> Hi there,
> trying to set
On Tue, May 28, 2013 at 10:14 AM, James Starlight
wrote:
> I've found GTX Titan with 6 GB of RAM and 384 bit. The price of such card is
> equal to the price of the latest TESLA cards.
Nope!
Titan: $1000
Tesla K10: $2750
Tesla K20(c): $3000
TITAN is cheaper than any Tesla and the fastest of all N
On Sat, May 25, 2013 at 2:16 PM, Broadbent, Richard
wrote:
> I've been running on my University's GPU nodes; these have one Xeon E5 (6 cores,
> 12 threads) and 4 Nvidia GTX 690s. My system is 93 000 atoms of DMF
> under NVE. The performance has been a little disappointing
That sounds like a
Dear all,
As far as I understand, the OP is interested in hardware for *running*
GROMACS 4.6 rather than developing code or running LINPACK.
To get best performance it is important to use a machine with hardware
balanced for GROMACS' workloads. Too little GPU resources will result
in CPU idling
10.04 comes with gcc 4.3 and 4.4 which should both work (we even test
them with Jenkins).
Still, you should really get a newer gcc, especially if you have an
8-core AMD CPU (=> either Bulldozer or Piledriver) both of which are
fully supported only by gcc 4.7 and later. Additionally, AFAIK the
2.6.
With the verlet cutoff scheme (new in 4.6) you get much better control
over the drift caused by (missed) short range interactions; you just
set a maximum allowed target drift and the buffer will be calculated
accordingly. Additionally, with the verlet scheme you are free to
tweak the neighbor searc
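In the mdp file this looks roughly like the following (a sketch; the
values shown are illustrative, tune them to your needs):
cutoff-scheme       = Verlet
verlet-buffer-drift = 0.005   ; target max. energy drift per atom, buffer sized from this
nstlist             = 20      ; with the Verlet scheme this can be increased freely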
The thread-MPI library provides the thread affinity setting
functionality to mdrun, hence certain parts of it will always be
compiled in, even with GMX_MPI=ON. Apparently, the Cray compiler does
not like some of the thread-MPI headers. Feel free to file a bug
report on redmine.gromacs.org, but *don
On Fri, May 17, 2013 at 2:48 PM, Djurre de Jong-Bruinink
wrote:
>
>
>>The answer is in the log files, in particular the performance summary
>>should indicate where is the performance difference. If you post your
>>log files somewhere we can probably give further tips on optimizing
>>your run confi
The answer is in the log files, in particular the performance summary
should indicate where is the performance difference. If you post your
log files somewhere we can probably give further tips on optimizing
your run configurations.
Note that with such a small system the scaling with the group sch
PS: if your compute-nodes are Intel of some recent architecture
OpenMP-only parallelization can be considerably more efficient.
For more details see
http://www.gromacs.org/Documentation/Acceleration_and_parallelization
--
Szilárd
On Thu, May 16, 2013 at 7:26 PM, Szilárd Páll wrote:
> I
I'm not sure what you mean by "threads". In GROMACS this can refer to
either thread-MPI or OpenMP multi-threading. To run within a single
compute node a default GROMACS installation using either of the two
aforementioned parallelization methods (or a combination of the two)
can be used.
--
Szilárd
Hi,
Such an issue typically indicates a GPU kernel crash. This can be
caused by a large variety of factors from program bug to GPU hardware
problem. To do a simple check for the former please run with the CUDA
memory checker, e.g:
/usr/local/cuda/bin/cuda-memcheck mdrun [...]
Additionally, as you
On Mon, Apr 29, 2013 at 3:51 PM, Albert wrote:
> On 04/29/2013 03:47 PM, Szilárd Páll wrote:
>>
>> In that case, while it isn't very likely, the issue could be caused by
>> some implementation detail which aims to avoid performance loss caused
>> by an issue
e GPU while mdrun was running?
Cheers,
--
Szilárd
On Mon, Apr 29, 2013 at 3:32 PM, Albert wrote:
> On 04/29/2013 03:31 PM, Szilárd Páll wrote:
>>
>> The segv indicates that mdrun crashed and not that the machine was
>> restarted. The GPU detection output (both on stderr and l
On Mon, Apr 29, 2013 at 2:41 PM, Albert wrote:
> On 04/28/2013 05:45 PM, Justin Lemkul wrote:
>>
>>
>> Frequent failures suggest instability in the simulated system. Check your
>> .log file or stderr for informative Gromacs diagnostic information.
>>
>> -Justin
>
>
>
> my log file didn't have any
Have you tried running on CPUs only just to see if the issue persists?
Unless the issue disappears when running the same binary on the same
hardware using CPUs only, I doubt it's a problem in the code.
Do you have ECC on?
--
Szilárd
On Sun, Apr 28, 2013 at 5:27 PM, Albert wrote:
> Dear:
>
>
This error means that your binaries contain machine instructions that
the processor you run them on does not support. The most probable
cause is that you compiled the binaries on a machine with different
architecture than the one you are running on.
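A typical fix (a sketch) is to rebuild either directly on the target
machine or with an acceleration level the target actually supports,
e.g.:
cmake .. -DGMX_CPU_ACCELERATION=SSE4.1
(other valid values include None, SSE2, AVX_128_FMA and AVX_256)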
Cheers,
--
Szilárd
On Mon, Apr 29, 2013 at 11
You got a warning at configure-time that the nvcc host compiler can't
be set because the MPI compiler wrappers are used. Because of this,
nvcc is using gcc to compile CPU code, which chokes on the icc flags.
You can:
- set CUDA_HOST_COMPILER to the mpicc backend, i.e. icc, or
- let cmake detect MPI an
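The first option would look roughly like this (a sketch; the exact
path to icc depends on your setup):
cmake .. -DGMX_GPU=ON -DCUDA_HOST_COMPILER=$(which icc)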
Hi,
You should really check out the documentation on how to use mdrun 4.6:
http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Running_simulations
Brief summary: when running on GPUs every domain is assigned to a set
of CPU cores and a GPU, hence you need to start as many PP MPI
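For instance, on a single node with two GPUs (illustrative numbers),
you would start two PP ranks and map them to the GPUs, e.g.:
mdrun -ntmpi 2 -gpu_id 01
or, with a real MPI build:
mpirun -np 2 mdrun_mpi -gpu_id 01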
On Mon, Apr 22, 2013 at 8:49 AM, Albert wrote:
> On 04/22/2013 08:40 AM, Mikhail Stukan wrote:
>>
>> Could you explain which hardware do you mean? As far as I know, K20X
>> supports double precision, so I would assume that double precision GROMACS
>> should be realizable on it.
>
>
> Really? But m
On Tue, Apr 9, 2013 at 6:52 PM, David van der Spoel wrote:
> On 2013-04-09 18:06, Mikhail Stukan wrote:
>
>> Dear experts,
>>
>> I have the following question. I am trying to compile GROMACS 4.6.1 with
>> GPU acceleration and have the following diagnostics:
>>
>> # cmake .. -DGMX_DOUBLE=ON -DGMX_B
Hi,
Your problem will likely be solved by not writing the rpath to the
binaries, which can be accomplished by setting -DCMAKE_SKIP_RPATH=ON.
This will mean that you will have to make sure that the library path
is set for mdrun to work.
If that does not fully solve the problem, you might have to b
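Without the rpath you will typically have to point the dynamic linker
at the GROMACS (and CUDA/FFTW) libraries yourself, e.g. something like
(the path below is just a placeholder):
export LD_LIBRARY_PATH=/path/to/gromacs/lib:$LD_LIBRARY_PATH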
On Thu, Apr 18, 2013 at 6:17 PM, Mike Hanby wrote:
> Thanks for the reply, so the next question, after I finish building single
> precision non parallel, is there an efficient way to kick off the double
> precision build, then the single precision mpi and so on?
>
> Or do I need to delete everyt
On Sat, Apr 13, 2013 at 5:27 PM, Szilárd Páll wrote:
> On Sat, Apr 13, 2013 at 3:30 PM, Mirco Wahab
> wrote:
>> On 12.04.2013 20:20, Szilárd Páll wrote:
>>>
>>> On Fri, Apr 12, 2013 at 3:45 PM, 라지브간디 wrote:
>>>>
>>>> Can cygwin recog
On Sat, Apr 13, 2013 at 3:30 PM, Mirco Wahab
wrote:
> On 12.04.2013 20:20, Szilárd Páll wrote:
>>
>> On Fri, Apr 12, 2013 at 3:45 PM, 라지브간디 wrote:
>>>
>>> Can cygwin recognize the CUDA installed in win 7? if so, how do i link
>>> them ?
>>
>&
On Fri, Apr 12, 2013 at 3:45 PM, 라지브간디 wrote:
> Thanks for your answers. I have uninstalled the mpi, have also reinstalled
> the CUDA and got the same issue. As you have mentioned before I noticed that
> it struggle to detect the CUDA.
Do you mean that you reconfigured without MPI and with CUDA
Indeed it's strange. In fact, it seems that CUDA detection did not
even run; there should be a message on whether it found the toolkit or
not just before the "Enabling native GPU acceleration" - and the
enabling should not even happen without CUDA detected.
Unrelated, but do you really need MPI with
Hi,
No, it just means that *your simulation* does not scale. The question
is very vague, hence impossible to answer without more details.
However, assuming that you are not running a, say, 5000 atom system
over 6 nodes, the most probable reason is that you have 6 Sandy Bridge
nodes with 12-16 core
On Wed, Apr 10, 2013 at 4:24 PM, Szilárd Páll wrote:
> Hi Andrew,
>
> As others have said, 40x speedup with GPUs is certainly possible, but more
> often than not comparisons leading to such numbers are not entirely fair -
> at least from a computational perspective. The most comm
On Wed, Apr 10, 2013 at 4:50 PM, 申昊 wrote:
> Hello,
>I wanna ask some questions about load imbalance.
> 1> Here are the messages resulting from grompp -f md.mdp -p topol.top -c
> npt.gro -o md.tpr
>
>NOTE 1 [file md.mdp]:
> The optimal PME mesh load for parallel simulations is below 0.5
On Wed, Apr 10, 2013 at 4:48 PM, 陈照云 wrote:
> I have tested gromacs-4.6.1 with k20.
> But when I run mdrun, I met some problems.
> 1. Does the GPU only support float (single-precision) acceleration?
>
Yes.
> 2. Configure options are -DGMX_MPI, -DGMX_DOUBLE.
> But if I run in parallel with mpirun, it goes wrong with
Hi Andrew,
As others have said, 40x speedup with GPUs is certainly possible, but more
often than not comparisons leading to such numbers are not entirely fair -
at least from a computational perspective. The most common case is when
people compare legacy, poorly (SIMD)-optimized codes with some ne
On Wed, Apr 10, 2013 at 3:34 AM, Benjamin Bobay wrote:
> Szilárd -
>
> First, many thanks for the reply.
>
> Second, I am glad that I am not crazy.
>
> Ok so based on your suggestions, I think I know what the problem is/was.
> There was a sander process running on 1 of the CPUs. Clearly GROMACS
Hi Ben,
That performance is not reasonable at all - neither for a CPU-only run on
your quad-core Sandy Bridge, nor for the CPU+GPU run. For the latter you
should be getting more like 50 ns/day or so.
What's strange about your run is that the CPU-GPU load balancing is picking
a *very* long cut-off w
On Mon, Apr 8, 2013 at 1:37 PM, Justin Lemkul wrote:
> On Mon, Apr 8, 2013 at 2:28 AM, Hrachya Astsatryan wrote:
>
> > Dear all,
> >
> > We have installed the latest version of Gromacs (version 4.6) on our
> > cluster by the following step:
> >
> > * cmake .. -DGMX_MPI=ON -DCMAKE_INSTALL_PREFIX
Hi,
As the error message states, the reason for the failed configuration is
that CMake can't auto-detect MPI, which is needed when you are not providing
the MPI compiler wrapper as the compiler.
If you want to build with MPI you can either let CMake auto-detect MPI and
just compile with the C compiler
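Concretely, either of these should work (a sketch; the wrapper names
depend on your MPI installation):
cmake .. -DGMX_MPI=ON
cmake .. -DGMX_MPI=ON -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx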
Hi,
You can certainly use your hardware setup. I assume you've been looking at
the log/console output based on which it might seem that mdrun is only
using the GPUs in the first (=master) node. However, that is not the case,
it's just that the current hardware and launch configuration reporting is
>
>
> --
> Chandan kumar Choudhury
> NCL, Pune
> INDIA
>
>
> On Thu, Mar 28, 2013 at 4:26 PM, Chandan Choudhury wrote:
>
> >
> >> On Thu, Mar 28, 2013 at 4:09 PM, Szilárd Páll wrote:
> >
> >> Hi,
> >>
> >> If mdrun s
Hi,
If mdrun says that it could not detect GPUs it simply means that the GPU
enumeration found no GPUs, otherwise it would have printed what was found.
This is rather strange because mdrun uses the same mechanism the
deviceQuery SDK example. I really don't have a good idea what could be the
issue,
Hi,
Actually, if you don't want to run across the network, with those Westmere
processors you should be fine with running OpenMP across the two sockets,
i.e.
mdrun -ntomp 24
or to run without HyperThreading (which can be sometimes faster) just use
mdrun -ntomp 12 -pin on
Now, when it comes to GPU
FYI: On your machine, running OpenMP across two sockets will probably not be
very efficient. Depending on the input and on how high a parallelization you
are running at, you may be better off running multiple MPI ranks per
GPU. This is a bit of an unexplained feature due to it being complicated to
Hi Quentin,
That's just a way of saying that something is wrong with either of the
following (in order of possibility of the event):
- your GPU driver is too old, hence incompatible with your CUDA version;
- your GPU driver installation is broken;
- your GPU is behaving in an unexpected/strange ma
FYI: As much as Intel likes to say that you can "just run" MPI/MPI+OpenMP
code on MIC, you will probably not be impressed with the performance (it
will be *much* slower than a Xeon CPU).
If you want to know why and what/when are we doing something about it,
please read my earlier comments on MIC p
Hi Chris,
You should be able to run on MIC/Xeon Phi as these accelerators, when used
in symmetric mode, behave just like a compute node. However, for two main
reasons the performance will be quite bad:
- no SIMD accelerated kernels for MIC;
- no accelerator-specific parallelization implemented (as
As Mark said, we need concrete details to answer the question:
- log files (all four of them: 1/2 nodes, 4.5/4.6)
- hardware (CPUs, network)
- compilers
The 4.6 log files contain much of the second and third point except the
network.
Note that you can compare the performance summary table's entrie
elp out other
> users!
> >
> > As an aside, I found that the OpenMP + Verlet combination was slower for
> > this particular system, but I suspect that it's because it's almost
> > entirely water and hence probably benefits from the Group scheme
> > optimi
On Thu, Mar 7, 2013 at 2:02 PM, Berk Hess wrote:
>
> Hi,
>
> This was only a note, not a fix.
> I was just trying to say that what linear algebra library you use for
> Gromacs is irrelevant in more than 99% of the cases.
> But having said that, the choice of library should not complicate the
> co