> You mean the mpc8xx, but I'm also using the mpc832x, which has an e300c2
> core and is capable of executing 2 insns in parallel if not in the same
> unit.

That should let you do a memory read and an add in the same clock. (I can't remember whether the ppc has an 'add from memory' instruction, but that would likely use both units anyway.) An infinitely unrolled loop would then run at 4 bytes/clock (for 32-bit words). If you get to 3 for a real loop you are doing OK.

Remember, unroll too much and you displace other code from the i-cache. The i-cache loads themselves also kill you. (A hot-cache benchmark won't see this...)

David
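
For illustration only, here is a minimal C sketch of the kind of loop being discussed: a ones' complement sum over 32-bit words, unrolled by four so that each iteration is four load/add pairs. The function names (csum_fold64, csum_words_unrolled) and the unroll factor are assumptions made for this example and are not taken from any existing kernel code. It uses a 64-bit accumulator to collect the carries, where real 32-bit ppc code would more likely use add-with-carry instructions, and it sums in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 64-bit accumulator down to a 16-bit ones' complement sum. */
static uint16_t csum_fold64(uint64_t sum)
{
    sum = (sum & 0xffffffffu) + (sum >> 32);  /* at most 33 bits */
    sum = (sum & 0xffffffffu) + (sum >> 32);  /* at most 32 bits */
    sum = (sum & 0xffffu) + (sum >> 16);      /* at most 17 bits */
    sum = (sum & 0xffffu) + (sum >> 16);      /* at most 16 bits */
    return (uint16_t)sum;
}

/*
 * Sum nwords 32-bit words, unrolled by four (hypothetical unroll factor).
 * Each statement in the main loop is one load plus one add, which is the
 * pair that a dual-issue core can, at best, retire once per clock.
 */
static uint16_t csum_words_unrolled(const uint32_t *buf, size_t nwords)
{
    uint64_t sum = 0;
    size_t i = 0;

    assert(buf != NULL || nwords == 0);

    for (; i + 4 <= nwords; i += 4) {
        sum += buf[i];
        sum += buf[i + 1];
        sum += buf[i + 2];
        sum += buf[i + 3];
    }
    /* Leftover words that did not fill a whole unrolled iteration. */
    for (; i < nwords; i++)
        sum += buf[i];

    return csum_fold64(sum);
}
```

A larger unroll factor amortises the loop overhead further, but it also makes the function bigger, which is the i-cache cost mentioned above.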