Hi all!

Well, first of all, sorry if this is the wrong GROMACS list; I could not find a clear indication on the website of where benchmarks should be posted.

Anyway, some time ago I asked the list for help with running these benchmarks, which I want to use to compare different compilers. I have been able to compile and run the benchmarks with GCC (double and single precision) and Portland (single precision). Unfortunately, I could not make it work with the Intel compiler (yes, I will ask for help again later... ;) ).

Well, here we go: first come the benchmarks as I measured them, with a CPU usage of about 98% (it varies among the tests). After those, I list the same benchmarks with the performance of each test rescaled to 100% CPU usage (the rows marked {100%}):

                                                                            Benchmark
Machine  CPU/Core type  N  Compiler                Clock (MHz)  Cache (kb)  Villin  Lys/Cut  Lys/PME  DPPC  Poly-CH2  Average  Rate
Linux    Athlon         1  gcc                             800         512    2412      622      456    41      1001      100  1.00
Linux    Athlon 64      1  gcc                            1800         512    9607     2686     1778   178      4344      410  1.82
Linux    Athlon 64      1  gcc + acml                     1800         512    9604     2687     1782   178      4336      410  1.82
Linux    Athlon 64      1  gcc (dp)                       1800         512    5607     1633     1175   117      3420      264  1.17
Linux    Athlon 64      1  gcc + acml (dp)                1800         512    5604     1637     1174   118      3423      264  1.17
Linux    Athlon 64      1  portland                       1800         512    9177     2500     1638   166      3905      384  1.71
Linux    Athlon 64      1  portland + acml                1800         512    9181     2499     1639   166      3905      384  1.71
Linux    Athlon 64      1  gcc {100%}                     1800         512    9823     2730     1844   186      4455      420  1.87
Linux    Athlon 64      1  gcc + acml {100%}              1800         512    9820     2762     1815   182      4438      420  1.87
Linux    Athlon 64      1  gcc (dp) {100%}                1800         512    5716     1675     1205   120      3519      270  1.20
Linux    Athlon 64      1  gcc + acml (dp) {100%}         1800         512    5707     1672     1203   121      3500      269  1.20
Linux    Athlon 64      1  portland {100%}                1800         512    9355     2546     1668   171      3989      392  1.74
Linux    Athlon 64      1  portland + acml {100%}         1800         512    9368     2545     1671   169      3989      392  1.74
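
For anyone who wants to check the numbers in the table above, here is a minimal Python sketch of the two derived quantities. It rests on two assumptions that are not stated in the post: that the {100%} rows are simply the measured value divided by the CPU fraction, and that Rate is the Average relative to the 800 MHz reference row divided by the clock ratio. The function names are made up for illustration, and the CPU fraction used here is back-computed from the table, since the per-run CPU usage was not posted.

```python
# Reference row from the table: Athlon, 800 MHz, Average = 100, Rate = 1.00.
REF_CLOCK_MHZ = 800
REF_AVERAGE = 100


def rescale_to_full_cpu(measured, cpu_fraction):
    """Extrapolate a measured performance figure to 100% CPU usage."""
    return measured / cpu_fraction


def rate_per_clock(average, clock_mhz):
    """ASSUMED definition of Rate: Average relative to the reference row,
    divided by the clock ratio to the reference row."""
    return (average / REF_AVERAGE) / (clock_mhz / REF_CLOCK_MHZ)


# CPU fraction back-computed from the Villin column of the gcc rows
# (9607 measured, 9823 rescaled); the real per-run values were not posted.
implied_cpu = 9607 / 9823

print(f"implied CPU usage: {implied_cpu:.1%}")
print(f"gcc rate:          {rate_per_clock(410, 1800):.2f}")
print(f"gcc (dp) rate:     {rate_per_clock(264, 1800):.2f}")
print(f"portland rate:     {rate_per_clock(384, 1800):.2f}")
```

Under that assumed definition the Rate column of every Athlon 64 row is reproduced to two decimals, which suggests that Rate is a per-clock figure rather than a raw speedup.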

Well, let us see what I can conclude from here. First, Portland is worse than GCC (not dramatically, but worse). That's already bad. Even worse, using the ACML libraries either yields a very small performance gain or actually loses against the plain GCC build.
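
To put rough numbers on that comparison, here is a small Python sketch that expresses each measured (not rescaled) Athlon 64 build relative to plain single-precision gcc, using only the Average column from the table. The dictionary keys and the helper name are just illustrative.

```python
# Average column of the measured (not rescaled) Athlon 64 rows.
averages = {
    "gcc": 410,
    "gcc + acml": 410,
    "gcc (dp)": 264,
    "gcc + acml (dp)": 264,
    "portland": 384,
    "portland + acml": 384,
}


def percent_vs_baseline(value, baseline):
    """Signed percentage difference of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline


baseline = averages["gcc"]
for name, avg in averages.items():
    print(f"{name:<16} {percent_vs_baseline(avg, baseline):+6.1f}%")
```

On the Average column, the ACML rows are identical to their non-ACML counterparts; the small ACML differences only show up in the individual benchmark columns.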

Can anyone tell me whether this kind of behavior, for both PGI and ACML used as external BLAS and LAPACK, is expected?

Also, is there any extra performance to be gained from using the Intel compilers on this architecture? Has anybody seen the following error during compilation (in the optimized 1/sqrt() function) before? The messages are in Portuguese: "Falha de segmentação" means "Segmentation fault" and "Erro 1" means "Error 1".

***********************************************************************************
./mknb   -software_invsqrt
 >>> Gromacs nonbonded kernel generator (-h for help)
 >>> Generating single precision functions in C.
 >>> Using Gromacs software version of 1/sqrt(x).
 make[5]: *** [kernel-stamp] Falha de segmentação
 make[5]: Leaving directory `/home/johannes/src/gromacs/gromacs-3.3/src/gmxlib/nonbonded/nb_kernel'
 make[4]: ** [all-recursive] Erro 1
***********************************************************************************

Hope this can be of use to someone...

Also, thanks a lot for any and all help in advance. :)

Jones

P.S.: I saw on the Folding@Home web site that they are already trying to make GROMACS run on certain GPUs, AND getting some useful results. I was wondering: if it becomes a reality there, how long would it be expected to take before it is available as a patch or in the official GROMACS release? Those co-processors are like a dream for many people in the field, and a GPU GROMACS like the one they are developing would be a real leap in this area! :D
