On Dec 20, 2007, at 10:22 PM, Markus Weissmann wrote:
On Dec 20, 2007, at 5:03 PM, Vincent Lefevre wrote:
On 2007-12-20 17:59:25 +1100, Joshua Root wrote:
[1] <http://lixom.net/~olof/64bit-perf.pdf>
[2] <http://www.geekpatrol.ca/2006/09/32-bit-vs-64-bit-performance/>
I don't understand why they say that 5 instructions are needed for
constants in 64-bit binaries. Can't the PowerPC load the constant
from memory with a single instruction? This is the solution
chosen on the ARM for complex constants (if they are in the cache,
this should be fast enough). But many constants are simple enough
to be loaded with a single instruction (on the ARM, these are 8-bit
values rotated by an even number of positions), in particular after
optimizing the code.
If I remember correctly, all PowerPC instructions are 32 bits long.
Since some of those bits are taken by the opcode, only 16 bits remain
to hold a constant value (this is the case for the load high/add
immediate instructions).
So, to load a 64 bit value, you need
-2x load high (the two upper 16 bit halves)
-2x add immediate (the two lower 16 bit halves)
-1x some combining instruction (a shift operation or similar)
Keep in mind that you only pay this cost for constants that really
are 64 bits wide, which in practice mostly means pointers. If you
want a 32 bit integer, you don't need to load 64 bits, even in 64 bit
mode.
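
The five steps listed above can be modeled in a few lines of Python.
This is only a sketch of the idea, not real PowerPC code: the helper
names are made up for illustration, the mnemonics in the comments are
my assumption of what a compiler would typically emit (OR-immediate
stands in for "add immediate"), and sign extension of the immediates
is ignored.

```python
MASK64 = (1 << 64) - 1

def split16(value):
    """Split a 64 bit value into four 16 bit chunks, most significant first."""
    return [(value >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]

def build_constant(value):
    """Rebuild `value` in five steps, each carrying at most one 16 bit immediate."""
    highest, higher, high, low = split16(value)
    reg = highest << 16          # 1. load high: bits 16..31 (assumed: lis)
    reg |= higher                # 2. fill in bits 0..15 (assumed: ori)
    reg = (reg << 32) & MASK64   # 3. combine step: move both into the upper word
    reg |= high << 16            # 4. second "high" half: bits 16..31 (assumed: oris)
    reg |= low                   # 5. last 16 bits (assumed: ori)
    return reg

if __name__ == "__main__":
    print(hex(build_constant(0x0123456789ABCDEF)))
```

A value that fits in 16 or 32 bits needs only the first one or two
steps, which is why plain 32 bit integers do not pay the full price.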
Oh, and don't forget that 64 bit Intel code is actually most often
faster than 32 bit code, thanks to twice the number of registers and
some other goodies.
I've compiled a 32bit/64bit universal `bzcat' and ran each version
three times on my Core 2 Duo machine. For this randomly chosen (!)
benchmark, 64 bit mode gets an impressive edge of ~20%:
$ time ./bzcat-64 gcc-core-4.3-20071214.tar.bz2 >/dev/null
real 0m5.889s
user 0m5.168s
sys 0m0.096s
$ time ./bzcat-64 gcc-core-4.3-20071214.tar.bz2 >/dev/null
real 0m5.516s
user 0m5.120s
sys 0m0.088s
$ time ./bzcat-64 gcc-core-4.3-20071214.tar.bz2 >/dev/null
real 0m5.489s
user 0m5.137s
sys 0m0.085s
$ time ./bzcat-32 gcc-core-4.3-20071214.tar.bz2 >/dev/null
real 0m7.407s
user 0m6.707s
sys 0m0.107s
$ time ./bzcat-32 gcc43/gcc-core-4.3-20071214.tar.bz2 >/dev/null
real 0m6.966s
user 0m6.540s
sys 0m0.097s
$ time ./bzcat-32 gcc43/gcc-core-4.3-20071214.tar.bz2 >/dev/null
real 0m7.051s
user 0m6.583s
sys 0m0.103s
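
To show where the ~20% comes from, here is a small sketch that
averages the `real' times from the six runs above. The numbers are
copied from the output; the variable names are just for illustration.

```python
from statistics import mean

# "real" times in seconds, copied from the runs above
real_64 = [5.889, 5.516, 5.489]
real_32 = [7.407, 6.966, 7.051]

mean_64 = mean(real_64)
mean_32 = mean(real_32)

# Fraction of wall-clock time saved by the 64 bit binary
time_saved = 1 - mean_64 / mean_32
# The same gap, expressed as how much longer the 32 bit binary takes
slowdown_32 = mean_32 / mean_64 - 1

print(f"mean real, 64 bit: {mean_64:.3f}s")
print(f"mean real, 32 bit: {mean_32:.3f}s")
print(f"64 bit saves {time_saved:.1%} of the time")
print(f"32 bit takes {slowdown_32:.1%} longer")
```

Measured as time saved, the edge is a bit over 20%; measured as extra
time needed by the 32 bit binary, it is closer to 27%. Three runs per
binary is a small sample, so treat both figures as rough.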
Regards,
-Markus
PS: To force the system to run `xy' in 32 or 64 bit mode (or even
under Rosetta), make a copy of the executable that contains only the
arch you want, e.g. `lipo -extract i386 bzcat -output bzcat-32'
extracts the 32 bit Intel code from bzcat and stores it in bzcat-32.
--
Dipl. Inf. (FH) Markus W. Weissmann
http://www.macports.org/
http://www.mweissmann.de/