Insane performance regression?

2002-01-19 Duraid Madina
Hi all, I have a CPU-bound (well, 'malloc-bound' ;) program which takes about 20 seconds to run on a 'fast' PC (Pentium3-1000, Athlon XP1600, etc.) - the source is available at http://www.idesign.fl.net.au/malloc_pain/malloc_pain.tar.gz (NOTE: you *will* need GCC 3 (or more recent) to compi

SSE bcopy etc.

2001-12-28 Duraid Madina
Hi all, while we're on the subject of AMD processors... has anyone considered adding Pentium 2/3/4/Athlon/Athlon XP support to the low-level string/bytecopy routines? If we just supported SSE (1) that'd get us (okay, me) a pretty nice performance boost on the P2, P3, P4 and Athlon XP,