[gmx-users] Benchs on gcc and pgi

Jones de Andrade Tue, 16 May 2006 23:33:17 -0700

Hi all!

First of all, sorry if this is the wrong GROMACS list; I could not find a clear indication on the website of where benchmarks should be posted.

Anyway, some time ago I asked the list for help with running these benchmarks, which I want to use to compare different compilers. I've been able to compile and run the benchmarks with GCC (double and single precision) and with the Portland (PGI) compiler (single precision). Unfortunately, I could not make it work with the Intel compiler (yes, I will ask for help again later.. ;) ).

Here we go: first, the benchmarks as measured, with a CPU usage of about 98% (it varies among the tests). After those, the same benchmarks again, but with the performance of each test rescaled to 100% CPU usage (rows marked {100%}):

Machine  CPU type   N  Compiler                Clock (MHz)  Cache (kB)  Villin  Lys/Cut  Lys/PME  DPPC  Poly-CH2  Average  Rate
Linux    Athlon     1  gcc                             800         512    2412      622      456    41      1001      100  1.00
Linux    Athlon 64  1  gcc                            1800         512    9607     2686     1778   178      4344      410  1.82
Linux    Athlon 64  1  gcc + acml                     1800         512    9604     2687     1782   178      4336      410  1.82
Linux    Athlon 64  1  gcc (dp)                       1800         512    5607     1633     1175   117      3420      264  1.17
Linux    Athlon 64  1  gcc + acml (dp)                1800         512    5604     1637     1174   118      3423      264  1.17
Linux    Athlon 64  1  portland                       1800         512    9177     2500     1638   166      3905      384  1.71
Linux    Athlon 64  1  portland + acml                1800         512    9181     2499     1639   166      3905      384  1.71
Linux    Athlon 64  1  gcc {100%}                     1800         512    9823     2730     1844   186      4455      420  1.87
Linux    Athlon 64  1  gcc + acml {100%}              1800         512    9820     2762     1815   182      4438      420  1.87
Linux    Athlon 64  1  gcc (dp) {100%}                1800         512    5716     1675     1205   120      3519      270  1.20
Linux    Athlon 64  1  gcc + acml (dp) {100%}         1800         512    5707     1672     1203   121      3500      269  1.20
Linux    Athlon 64  1  portland {100%}                1800         512    9355     2546     1668   171      3989      392  1.74
Linux    Athlon 64  1  portland + acml {100%}         1800         512    9368     2545     1671   169      3989      392  1.74
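
The rescaling to 100% CPU usage and the Rate column can be sketched as follows. This is a minimal sketch under assumptions, not necessarily the exact procedure behind the table: the function names are hypothetical, the 98% figure is only the approximate CPU usage quoted above (the real value varied per test), and the Rate formula (Average relative to the 800 MHz Athlon reference, scaled by the clock ratio) is an assumption that happens to reproduce the Rate values shown.

```python
# Hypothetical helpers; names and formulas are illustrative assumptions.

def rescale_to_full_cpu(performance, cpu_usage_percent):
    """Scale a measured performance figure to what 100% CPU usage would give."""
    return performance * 100.0 / cpu_usage_percent


def rate_per_clock(average, clock_mhz, ref_average=100.0, ref_clock_mhz=800.0):
    """Average relative to the reference machine, normalized by clock speed."""
    return (average / ref_average) * (ref_clock_mhz / clock_mhz)


if __name__ == "__main__":
    # gcc, single precision, Villin: 9607 measured at an ASSUMED 98% CPU usage.
    print(round(rescale_to_full_cpu(9607, 98.0)))
    # gcc, single precision: Average 410 at 1800 MHz.
    print(round(rate_per_clock(410, 1800), 2))
```

With the table's numbers, rate_per_clock(410, 1800) rounds to 1.82, which matches the Rate listed for the gcc row.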

Let us see what I can conclude from this. First, Portland is slower than GCC in single precision: roughly 5 to 10% slower depending on the benchmark. That's already bad. Even worse, using the ACML libraries either gives negligible extra performance or is marginally slower than the plain gcc build.
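
To put numbers on that comparison, here is a small Python sketch that computes the per-benchmark difference relative to the plain gcc build. The values are copied from the measured single-precision rows of the table above; the helper name percent_difference is made up for illustration.

```python
# Measured single-precision rows copied from the table (not the {100%} rows).
BENCHMARKS = ["Villin", "Lys/Cut", "Lys/PME", "DPPC", "Poly-CH2"]
GCC = [9607, 2686, 1778, 178, 4344]
PORTLAND = [9177, 2500, 1638, 166, 3905]
GCC_ACML = [9604, 2687, 1782, 178, 4336]


def percent_difference(candidate, baseline):
    """Signed difference of candidate relative to baseline, in percent."""
    return [100.0 * (c - b) / b for c, b in zip(candidate, baseline)]


if __name__ == "__main__":
    pgi = percent_difference(PORTLAND, GCC)
    acml = percent_difference(GCC_ACML, GCC)
    for name, p, a in zip(BENCHMARKS, pgi, acml):
        print(f"{name:9s} portland {p:+6.2f}%   gcc + acml {a:+6.2f}%")
```

A negative value means the candidate build is slower than plain gcc on that benchmark.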

Can anyone tell me whether this behavior, for both PGI and ACML used as external BLAS and LAPACK, is expected?

Also, is there any extra performance to be gained from using the Intel compilers on this architecture? Has anybody seen the following error during compilation (in the optimized 1/sqrt() function) before?

***********************************************************************************

./mknb -software_invsqrt
>>> Gromacs nonbonded kernel generator (-h for help)
>>> Generating single precision functions in C.
>>> Using Gromacs software version of 1/sqrt(x).
make[5]: *** [kernel-stamp] Segmentation fault
make[5]: Leaving directory `/home/johannes/src/gromacs/gromacs-3.3/src/gmxlib/nonbonded/nb_kernel'
make[4]: ** [all-recursive] Error 1
***********************************************************************************

Hope this can be of use to someone...

Also, thanks a lot for any and all help in advance. :)

Jones

P.S.: I saw on the Folding@Home web site that they are already trying to run gromacs on certain GPUs AND getting some useful results. I was wondering: if that becomes a reality there, how long would it be expected to take before it is available as a patch, or as part of the official gromacs distribution? Those co-processors are like a dream for many people in the field, and a GPU-Gromacs like the one they are developing would be a real leap forward! :D
