Ping on the series: https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01717.html https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01714.html https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01716.html (already approved by Richi) https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01715.html
as well as https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02132.html as described in the cover letter Thanks, Kyrill On 22/01/16 09:52, Kyrill Tkachov wrote:
Hi all, PR target/65932 is a wrong-code bug affecting arm and has manifested itself when compiling the Linux kernel, so it's something that we really ought to fix. The problem stems from the fact that PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE don't match up on arm. PROMOTE_MODE also marks the promotion as unsigned, whereas the TARGET_PROMOTE_FUNCTION_MODE doesn't. This can lead to short variables being wrongly zero-extended instead of sign-extended. This also occurs in PR target/67714. Jim Wilson tried a few approaches and from the discussion on the PR and on the ML (https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00814.html) the preferred approach is to make PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE match up. Changing TARGET_PROMOTE_FUNCTION_MODE to zero-extend would be an ABI change so we don't want to do that. Changing PROMOTE_MODE to not zero-extend fixes both PR 65932 and 67714. So Jim's patch is the first patch in this series. It has been posted at https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02132.html and this series is based on top of the arm.h hunk of that patch. There have been some concerns about the codegen quality fallout from Jim's patch, which is what the remaining patches in this series address: * 3 new arm testsuite failures: gcc.target/arm/wmul-[123].c. wmul-1.c and wmul-2.c are cases where we no longer generate sign-extend+multiply (and accumulate) instructions but instead generate the normal full-width multiplies (the operands are sign-extended from preceeding sign-extending loads). This is a regression for some targets on which the sign-extending form is faster. Patches 2 and 3 address this. gcc.target/arm/wmul-3.c is a test where we actually end up generating better code and so the testcase just needs to be adjusted. Patch 4 deals with that. * Sign-extending rather than zero-extending short values means we make more use of the ldrsb and ldrsh arm instructions rather than the zero-extending ldrb and ldrh. On Thumb1 targets ldrsb and ldrsh have more limited addressing modes (only REG + REG), which could hurt code size. However, the change also means that we can now merge sequences of load-zero-extend followed by a sign-extend into a single load-sign-extend. So we'd turn a (ldrh ; sxth) into an (ldrsh). I've built a few popular embedded benchmarks for a Cortex-M0 target (Thumb1) and looked at the generated assembly for -Os and -O2. I did see both effects. That is, ldrh instructions with an immediate being turned into two instructions: ldrh r4, [r4, #12] -> movs r5, #12 ldrsh r4, [r4, r5] But I also observed the beneficial effect. Various register allocation perturbations also featured in the changes Overall, for every codebase I've looked at both -Os and -O2 the new code was slightly smaller. Probably not enough to call this an outright win, but definitely not a regression overall. As for ARM and Thumb2 states, this series doesn't have a major impact. SPEC2006 bencharks are slightly reduced in size (but nothing to write home about) and there are no code quality regressions. And even the regressions caught by the testsuite in the wmul-[12].c tests don't actually manifest in practice in SPEC, but they are fixed anyway. The series contains one target-independent change to CSE in patch 3 that I'll explain there. The series has been bootstrapped and tested on arm, aarch64 and x86_64. Is this ok for trunk together with Jim's arm.h hunk from g ? P.S. This also fixes PR middlen-end/67295 which is a testsuite regression on arm. Refer to the bugzilla entry there for the analysis of how this issue affects that PR. Thanks, Kyrill