For your Opteron, try these options:

  -O3 -fomit-frame-pointer -march=k8 -funroll-loops -finline-functions -fpeel-loops \
  -mno-sse3 -msse2 -msse -mno-mmx -mno-3dnow
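If you want to check which SIMD feature macros GCC actually enables with these flags, a sketch like the following should do it. It assumes a GCC that targets x86 is installed as `gcc`; the variable name CFLAGS is only my choice for this example.

```shell
# CFLAGS is a name chosen for this sketch; the flags are the ones suggested above.
CFLAGS="-O3 -fomit-frame-pointer -march=k8 -funroll-loops -finline-functions -fpeel-loops -mno-sse3 -msse2 -msse -mno-mmx -mno-3dnow"

# Dump the macros the preprocessor predefines under these flags
# and keep only the SIMD feature macros.
if command -v gcc >/dev/null 2>&1; then
    gcc $CFLAGS -dM -E -x c /dev/null | grep -E '__(MMX|3dNOW|SSE|SSE2|SSE3)__' \
        || echo "no SIMD macros reported (or this gcc does not target x86)"
else
    echo "gcc not found in PATH"
fi
```

Every macro that is listed corresponds to an instruction set the compiler is allowed to use. If the flags take effect as intended, I would expect __SSE2__ to be listed and __SSE3__ not.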
For Opteron hardware, SSE2 is reportedly a better choice than SSE3. The MMX and 3DNow!+ instructions are shorter and older than the SSE/SSE2 instructions, which is why they are disabled here.

Sincerely,
J.C.Pizarro