J.C. Pizarro wrote:
For your Opteron, try these options:
-O3 -fomit-frame-pointer -march=k8 -funroll-loops -finline-functions \
-fpeel-loops -mno-sse3 -msse2 -msse -mno-mmx -mno-3dnow
The Opteron hardware said that it's better to use SSE2 than SSE3.
The MMX and 3DNow!+ instructions are shorter and older than SSE2/SSE
instructions.
Interesting. With these flags, the peak was 39K/sec, and it didn't top out
until 272 client connections. (Quite a lengthy test; I'm running 2 minutes per
iteration with X number of clients, then increasing on the next iteration,
repeating until the transaction count stops growing. So this was over two
hours before it finally maxed out.) I guess this is a pretty good setting for
heavy scalability even though it didn't quite reach 40K/sec.
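
The ramp-up procedure described above could be sketched roughly as follows. This is only an illustration: the `measure` callable is hypothetical (it stands in for one fixed-length iteration of whatever load tool is used), and the starting client count and step size are assumptions, since the message does not say how the client count was increased.

```python
from typing import Callable, Tuple


def find_peak(
    measure: Callable[[int], float],
    start: int = 1,
    step: int = 1,
    max_clients: int = 1024,
) -> Tuple[int, float]:
    """Raise the client count until the measured rate stops growing.

    `measure(clients)` is a hypothetical hook that runs one iteration
    (e.g. 2 minutes) with the given number of clients and returns the
    observed transactions per second.
    """
    best_clients, best_rate = 0, 0.0
    clients = start
    while clients <= max_clients:
        rate = measure(clients)
        if rate <= best_rate:
            # Throughput stopped growing: the previous iteration was the peak.
            break
        best_clients, best_rate = clients, rate
        clients += step
    return best_clients, best_rate
```

Stopping on the first non-improving iteration keeps the sketch simple; a real run would probably want to tolerate some noise between iterations before declaring a peak.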
During these tests I see that about 94% of one core is consumed by interrupt
processing, with 2% idle time left. I guess this ~200K packets-per-second rate
is pretty near the limit of what this system can handle on gigabit ethernet.
I've seen this box hit as high as 43K auths/sec using 4 slapd processes with 3
threads each, as opposed to a single process with 8 threads. In that test 100%
of a core was doing interrupt processing.
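
As a back-of-the-envelope check on those figures, the packet rate can be divided by the auth rate. The sketch below assumes the ~200K packets/sec estimate corresponds to the ~39K auths/sec peak reported above; both inputs are approximations, so the result is only indicative.

```python
def packets_per_auth(packets_per_sec: float, auths_per_sec: float) -> float:
    """Average number of packets handled per authentication."""
    return packets_per_sec / auths_per_sec


# Approximate figures quoted in the message; not precise measurements.
ratio = packets_per_auth(200_000, 39_000)
print(f"about {ratio:.1f} packets per auth")
```

Under that assumption each authentication costs roughly five packets, counting traffic in both directions.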
Anyway, thanks again for all your responses.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/