On Fri, May 20, 2011 at 04:39:27PM +0400, Kirill Batuzov wrote:
> This series implements some basic machine-independent optimizations. They
> simplify code and allow liveness analysis to do its work better.
>
> Suppose we have the following ARM code:
>
>     movw    r12, #0xb6db
>     movt    r12, #0xdb6d
>
> In TCG before optimizations we'll have:
>
>     movi_i32 tmp8,$0xb6db
>     mov_i32 r12,tmp8
>     mov_i32 tmp8,r12
>     ext16u_i32 tmp8,tmp8
>     movi_i32 tmp9,$0xdb6d0000
>     or_i32 tmp8,tmp8,tmp9
>     mov_i32 r12,tmp8
>
> And after optimizations we'll have this:
>
>     movi_i32 r12,$0xdb6db6db
>
> Here are performance evaluation results on SPEC CPU2000 integer tests in
> user-mode emulation on an x86_64 host. There were 5 runs of each test on
> the reference data set. The tables below show runtime in seconds for all
> these runs.

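For readers following the movw/movt example quoted above, here is a minimal sketch of how copy propagation plus constant folding could collapse that seven-op sequence. It is not the code from tcg/optimize.c: the miniature IR, the `struct insn` layout, and the names `fold_constants` and `fold_example` are all hypothetical and chosen only for illustration. The pass only rewrites ops into immediate moves; removing the now-dead moves is left to liveness analysis, as the cover letter describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical miniature IR for illustration; NOT the real TCG structures. */
enum op { OP_MOVI, OP_MOV, OP_EXT16U, OP_OR };

struct insn {
    enum op op;
    int dst;        /* destination temp index */
    int src1;       /* first source temp (unused for OP_MOVI) */
    int src2;       /* second source temp (OP_OR only) */
    uint32_t imm;   /* immediate value (OP_MOVI only) */
};

enum { R12 = 0, TMP8 = 1, TMP9 = 2, NUM_TEMPS = 3 };

/* What the pass currently knows about each temp. */
struct temp_state {
    bool is_const;
    uint32_t val;
};

/*
 * Single forward pass over a straight-line block: any op whose inputs
 * are all known constants is rewritten into OP_MOVI with the computed
 * value. Returns the number of ops that were rewritten.
 */
static int fold_constants(struct insn *code, int n, struct temp_state *st)
{
    int folded = 0;

    for (int i = 0; i < NUM_TEMPS; i++) {
        st[i].is_const = false;
        st[i].val = 0;
    }

    for (int i = 0; i < n; i++) {
        struct insn *in = &code[i];
        bool known = false;
        uint32_t v = 0;

        assert(in->dst >= 0 && in->dst < NUM_TEMPS);

        switch (in->op) {
        case OP_MOVI:
            known = true;
            v = in->imm;
            break;
        case OP_MOV:
            if (st[in->src1].is_const) {
                known = true;
                v = st[in->src1].val;
            }
            break;
        case OP_EXT16U:
            if (st[in->src1].is_const) {
                known = true;
                v = st[in->src1].val & 0xffffu;
            }
            break;
        case OP_OR:
            if (st[in->src1].is_const && st[in->src2].is_const) {
                known = true;
                v = st[in->src1].val | st[in->src2].val;
            }
            break;
        }

        if (known && in->op != OP_MOVI) {
            in->op = OP_MOVI;   /* replace the op with an immediate move */
            in->imm = v;
            folded++;
        }
        st[in->dst].is_const = known;
        st[in->dst].val = v;
    }
    return folded;
}

/*
 * Runs the pass over the seven-op sequence from the cover letter and
 * returns the constant finally held in r12 (0 if it is not constant).
 */
static uint32_t fold_example(void)
{
    struct insn code[] = {
        { OP_MOVI,   TMP8, 0,    0,    0xb6dbu     },
        { OP_MOV,    R12,  TMP8, 0,    0           },
        { OP_MOV,    TMP8, R12,  0,    0           },
        { OP_EXT16U, TMP8, TMP8, 0,    0           },
        { OP_MOVI,   TMP9, 0,    0,    0xdb6d0000u },
        { OP_OR,     TMP8, TMP8, TMP9, 0           },
        { OP_MOV,    R12,  TMP8, 0,    0           },
    };
    struct temp_state st[NUM_TEMPS];

    fold_constants(code, (int)(sizeof(code) / sizeof(code[0])), st);
    return st[R12].is_const ? st[R12].val : 0;
}
```

In this sketch the last op ends up as an immediate move of 0xdb6db6db into r12, matching the optimized output quoted above; the earlier ops become dead stores that a liveness pass could then drop.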
How are the tests done? Are they done with linux-user, or by running the
executables in qemu-system-xxx?

> ARM guest without optimizations:
> Test name           #1        #2        #3        #4        #5    Median
> 164.gzip      1403.612  1403.499   1403.52   1208.55  1403.583   1403.52
> 175.vpr       1237.729  1238.008  1238.019  1176.852  1237.902  1237.902
> 176.gcc        929.511   928.867   929.048   928.927   928.792   928.927
> 181.mcf        196.371   196.335   196.172   197.057   196.196   196.335
> 186.crafty    1547.101  1547.293  1547.133  1547.037  1547.044  1547.101
> 197.parser    3804.336  3804.429  3804.412   3804.45  3804.301  3804.412
> 252.eon       2760.414   2760.45  2473.608  2760.606  2760.216  2760.414
> 253.perlbmk   2557.966  2558.971  2559.731  2479.299  2556.835  2557.966
> 256.bzip2     1296.412  1296.215   1296.63  1296.489  1296.092  1296.412
> 300.twolf     2919.496  2919.444  2919.529  2919.384  2919.404  2919.444
>
> ARM guest with optimizations:
> Test name           #1        #2        #3        #4        #5    Median     Gain
> 164.gzip      1345.416  1401.741  1377.022  1401.737  1401.773  1401.737    0.13%
> 175.vpr        1116.75  1243.213   1243.32  1243.316  1243.144  1243.213   -0.43%
> 176.gcc        897.045   909.568     850.1    909.65    909.57   909.568    2.08%
> 181.mcf        199.058   198.717    198.28   198.866   197.955   198.717   -1.21%
> 186.crafty    1525.667  1526.663  1525.981  1525.995  1526.164  1525.995    1.36%
> 197.parser    3749.453  3749.522  3749.413    3749.5  3749.484  3749.484    1.44%
> 252.eon       2730.593  2746.525  2746.495  2746.493   2746.62  2746.495    0.50%
> 253.perlbmk   2577.341  2521.057  2578.461  2578.721  2581.313  2578.461   -0.80%
> 256.bzip2     1184.498  1190.116  1294.352  1294.554  1294.637  1294.352    0.16%
> 300.twolf     2894.264  2894.133  2894.398  2894.103  2894.146  2894.146    0.87%
>
>
> x86_64 guest without optimizations:
> Test name           #1        #2        #3        #4        #5    Median
> 164.gzip       858.118   858.151    858.09   858.139   858.122   858.122
> 175.vpr        956.361   956.465   956.521   956.438   956.705   956.465
> 176.gcc        647.275   647.465   647.186   647.294   647.268   647.275
> 181.mcf        219.239   221.964   220.244    220.74   220.559   220.559
> 186.crafty    1128.027  1128.071  1128.028  1128.115  1128.123  1128.071
> 197.parser    1815.669  1815.651  1815.669  1815.711  1815.759  1815.669
> 253.perlbmk   1777.143  1777.749  1667.508  1777.051  1778.391  1777.143
> 254.gap       1062.808  1062.758  1062.801  1063.099  1062.859  1062.808
> 255.vortex    1930.693  1930.706  1930.579    1930.7  1930.566  1930.693
> 256.bzip2     1014.566  1014.702    1014.6  1014.274  1014.421  1014.566
> 300.twolf     1342.653  1342.759  1344.092  1342.641  1342.794  1342.759
>
> x86_64 guest with optimizations:
> Test name           #1        #2        #3        #4        #5    Median     Gain
> 164.gzip       857.485   857.457   857.475   857.509   857.507   857.485    0.07%
> 175.vpr        963.255   962.972    963.27   963.124   963.686   963.255   -0.71%
> 176.gcc        644.123   644.055   644.145   643.818   635.773   644.055    0.50%
> 181.mcf        216.215   217.549   218.744   216.437    217.83   217.549    1.36%
> 186.crafty    1128.873  1128.792  1128.871  1128.816  1128.823  1128.823   -0.07%
> 197.parser    1814.626  1814.503  1814.552  1814.602  1814.748  1814.602    0.06%
> 253.perlbmk   1758.056  1751.963  1753.267   1765.27  1759.828  1758.056    1.07%
> 254.gap       1064.702  1064.712  1064.629  1064.657  1064.645  1064.657   -0.17%
> 255.vortex    1760.638  1936.387  1937.871  1937.471  1760.496  1936.387   -0.29%
> 256.bzip2     1007.658  1007.682  1007.316  1007.982  1007.747  1007.682    0.68%
> 300.twolf     1334.139  1333.791  1333.795  1334.147  1333.732  1333.795    0.67%
>
> The ARM guest for 254.gap and 255.vortex and the x86_64 guest for 252.eon
> do not work under QEMU for some unrelated reason.
>
> Kirill Batuzov (6):
>   Add TCG optimizations stub
>   Add copy and constant propagation.
>   Do constant folding for basic arithmetic operations.
>   Do constant folding for boolean operations.
>   Do constant folding for shift operations.
>   Do constant folding for unary operations.
>
>  Makefile.target |    2 +-
>  tcg/optimize.c  |  539 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tcg/tcg.c       |    6 +
>  tcg/tcg.h       |    3 +
>  4 files changed, 549 insertions(+), 1 deletions(-)
>  create mode 100644 tcg/optimize.c
>
> --
> 1.7.4.1
>
>

--
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net