This series implements some basic machine-independent optimizations. They simplify code and allow liveness analysis to do its work better.
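As a rough illustration of the idea (not the code in tcg/optimize.c), here is a minimal C sketch of a forward constant propagation and folding pass over a toy IR. Every name in it (toy_op, toy_opc, toy_fold, TOY_NB_TEMPS) is hypothetical and exists only for this example. The pass only models the four kinds of ops that appear in the ARM example in this letter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy IR for illustration; these are not QEMU's TCG structures. */
enum toy_opc { TOY_MOVI, TOY_MOV, TOY_EXT16U, TOY_OR };

struct toy_op {
    enum toy_opc opc;
    int dst;       /* destination temp index */
    int src1;      /* first source temp (unused by TOY_MOVI) */
    int src2;      /* second source temp (TOY_OR only) */
    uint32_t imm;  /* immediate (TOY_MOVI only) */
};

#define TOY_NB_TEMPS 16  /* assumed upper bound on temp indices */

/*
 * Walk the ops in order, tracking which temps hold a known constant.
 * Any op whose inputs are all known is rewritten in place into TOY_MOVI.
 * Returns the number of ops rewritten.
 */
int toy_fold(struct toy_op *ops, size_t n)
{
    bool is_const[TOY_NB_TEMPS] = { false };
    uint32_t val[TOY_NB_TEMPS] = { 0 };
    int folded = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_op *op = &ops[i];
        bool known = false;
        uint32_t res = 0;

        switch (op->opc) {
        case TOY_MOVI:
            known = true;
            res = op->imm;
            break;
        case TOY_MOV:      /* copy propagation of a constant */
            known = is_const[op->src1];
            res = val[op->src1];
            break;
        case TOY_EXT16U:   /* zero-extend the low 16 bits */
            known = is_const[op->src1];
            res = val[op->src1] & 0xffff;
            break;
        case TOY_OR:
            known = is_const[op->src1] && is_const[op->src2];
            res = val[op->src1] | val[op->src2];
            break;
        }

        if (known && op->opc != TOY_MOVI) {
            op->opc = TOY_MOVI;
            op->imm = res;
            folded++;
        }

        /* Record what is now known about the destination temp. */
        is_const[op->dst] = known;
        val[op->dst] = res;
    }
    return folded;
}
```

In this sketch every op with constant inputs becomes a movi, so the last write to a register carries the final constant. Removing the earlier, now-dead movi ops is not done here; that is the part left to liveness analysis.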

Suppose we have the following ARM code:

    movw    r12, #0xb6db
    movt    r12, #0xdb6d

In TCG before optimizations we'll have:

    movi_i32 tmp8,$0xb6db
    mov_i32 r12,tmp8
    mov_i32 tmp8,r12
    ext16u_i32 tmp8,tmp8
    movi_i32 tmp9,$0xdb6d0000
    or_i32 tmp8,tmp8,tmp9
    mov_i32 r12,tmp8

And after optimizations we'll have this:

    movi_i32 r12,$0xdb6db6db

Here are performance evaluation results on SPEC CPU2000 integer tests in
user-mode emulation on an x86_64 host. There were 5 runs of each test on the
reference data set. The tables below show runtime in seconds for all these runs.

ARM guest without optimizations:
Test name    #1        #2        #3        #4        #5        Median
164.gzip     1403.612  1403.499  1403.52   1208.55   1403.583  1403.52
175.vpr      1237.729  1238.008  1238.019  1176.852  1237.902  1237.902
176.gcc      929.511   928.867   929.048   928.927   928.792   928.927
181.mcf      196.371   196.335   196.172   197.057   196.196   196.335
186.crafty   1547.101  1547.293  1547.133  1547.037  1547.044  1547.101
197.parser   3804.336  3804.429  3804.412  3804.45   3804.301  3804.412
252.eon      2760.414  2760.45   2473.608  2760.606  2760.216  2760.414
253.perlbmk  2557.966  2558.971  2559.731  2479.299  2556.835  2557.966
256.bzip2    1296.412  1296.215  1296.63   1296.489  1296.092  1296.412
300.twolf    2919.496  2919.444  2919.529  2919.384  2919.404  2919.444

ARM guest with optimizations:
Test name    #1        #2        #3        #4        #5        Median    Gain
164.gzip     1345.416  1401.741  1377.022  1401.737  1401.773  1401.737  0.13%
175.vpr      1116.75   1243.213  1243.32   1243.316  1243.144  1243.213  -0.43%
176.gcc      897.045   909.568   850.1     909.65    909.57    909.568   2.08%
181.mcf      199.058   198.717   198.28    198.866   197.955   198.717   -1.21%
186.crafty   1525.667  1526.663  1525.981  1525.995  1526.164  1525.995  1.36%
197.parser   3749.453  3749.522  3749.413  3749.5    3749.484  3749.484  1.44%
252.eon      2730.593  2746.525  2746.495  2746.493  2746.62   2746.495  0.50%
253.perlbmk  2577.341  2521.057  2578.461  2578.721  2581.313  2578.461  -0.80%
256.bzip2    1184.498  1190.116  1294.352  1294.554  1294.637  1294.352  0.16%
300.twolf    2894.264  2894.133  2894.398  2894.103  2894.146  2894.146  0.87%

x86_64 guest without optimizations:
Test name    #1        #2        #3        #4        #5        Median
164.gzip     858.118   858.151   858.09    858.139   858.122   858.122
175.vpr      956.361   956.465   956.521   956.438   956.705   956.465
176.gcc      647.275   647.465   647.186   647.294   647.268   647.275
181.mcf      219.239   221.964   220.244   220.74    220.559   220.559
186.crafty   1128.027  1128.071  1128.028  1128.115  1128.123  1128.071
197.parser   1815.669  1815.651  1815.669  1815.711  1815.759  1815.669
253.perlbmk  1777.143  1777.749  1667.508  1777.051  1778.391  1777.143
254.gap      1062.808  1062.758  1062.801  1063.099  1062.859  1062.808
255.vortex   1930.693  1930.706  1930.579  1930.7    1930.566  1930.693
256.bzip2    1014.566  1014.702  1014.6    1014.274  1014.421  1014.566
300.twolf    1342.653  1342.759  1344.092  1342.641  1342.794  1342.759

x86_64 guest with optimizations:
Test name    #1        #2        #3        #4        #5        Median    Gain
164.gzip     857.485   857.457   857.475   857.509   857.507   857.485   0.07%
175.vpr      963.255   962.972   963.27    963.124   963.686   963.255   -0.71%
176.gcc      644.123   644.055   644.145   643.818   635.773   644.055   0.50%
181.mcf      216.215   217.549   218.744   216.437   217.83    217.549   1.36%
186.crafty   1128.873  1128.792  1128.871  1128.816  1128.823  1128.823  -0.07%
197.parser   1814.626  1814.503  1814.552  1814.602  1814.748  1814.602  0.06%
253.perlbmk  1758.056  1751.963  1753.267  1765.27   1759.828  1758.056  1.07%
254.gap      1064.702  1064.712  1064.629  1064.657  1064.645  1064.657  -0.17%
255.vortex   1760.638  1936.387  1937.871  1937.471  1760.496  1936.387  -0.29%
256.bzip2    1007.658  1007.682  1007.316  1007.982  1007.747  1007.682  0.68%
300.twolf    1334.139  1333.791  1333.795  1334.147  1333.732  1333.795  0.67%

ARM guests for 254.gap and 255.vortex and the x86_64 guest for 252.eon do not
work under QEMU for some unrelated reason.

Kirill Batuzov (6):
  Add TCG optimizations stub
  Add copy and constant propagation.
  Do constant folding for basic arithmetic operations.
  Do constant folding for boolean operations.
  Do constant folding for shift operations.
  Do constant folding for unary operations.

 Makefile.target |    2 +-
 tcg/optimize.c  |  539 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c       |    6 +
 tcg/tcg.h       |    3 +
 4 files changed, 549 insertions(+), 1 deletions(-)
 create mode 100644 tcg/optimize.c

-- 
1.7.4.1