On Thu, Feb 4, 2016 at 1:34 AM, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > > On 04/02/16 09:13, Ramana Radhakrishnan wrote: >> >> On Fri, Jan 22, 2016 at 9:52 AM, Kyrill Tkachov >> <kyrylo.tkac...@foss.arm.com> wrote: >>> >>> Hi all, >>> >>> PR target/65932 is a wrong-code bug affecting arm and has manifested >>> itself >>> when compiling the Linux kernel, so it's something that we really >>> ought to fix. The problem stems from the fact that PROMOTE_MODE and >>> TARGET_PROMOTE_FUNCTION_MODE don't match up on arm. >>> PROMOTE_MODE also marks the promotion as unsigned, whereas the >>> TARGET_PROMOTE_FUNCTION_MODE doesn't. This can lead to short variables >>> being wrongly zero-extended instead of sign-extended. This also occurs >>> in PR target/67714. >>> >>> Jim Wilson tried a few approaches and from the discussion >>> on the PR and on the ML >>> (https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00814.html) >>> the preferred approach is to make PROMOTE_MODE and >>> TARGET_PROMOTE_FUNCTION_MODE >>> match up. Changing TARGET_PROMOTE_FUNCTION_MODE to zero-extend would be >>> an >>> ABI >>> change so we don't want to do that. Changing PROMOTE_MODE to not >>> zero-extend >>> fixes both PR 65932 and 67714. So Jim's patch is the first patch in this >>> series. >>> >>> It has been posted at >>> https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02132.html >>> and this series is based on top of the arm.h hunk of that patch. >>> >>> There have been some concerns about the codegen quality fallout from >>> Jim's >>> patch, which is what the remaining patches in this series address: >>> >>> * 3 new arm testsuite failures: gcc.target/arm/wmul-[123].c. >>> wmul-1.c and wmul-2.c are cases where we no longer generate >>> sign-extend+multiply (and accumulate) instructions but instead generate >>> the normal full-width multiplies (the operands are sign-extended from >>> preceeding sign-extending loads). This is a regression for some targets >>> on which the sign-extending form is faster. Patches 2 and 3 address this. >>> gcc.target/arm/wmul-3.c is a test where we actually end up generating >>> better code and so the testcase just needs to be adjusted. >>> Patch 4 deals with that. >>> >>> * Sign-extending rather than zero-extending short values means we make >>> more >>> use of the ldrsb and ldrsh arm instructions rather than the >>> zero-extending >>> ldrb and ldrh. On Thumb1 targets ldrsb and ldrsh have more limited >>> addressing >>> modes (only REG + REG), which could hurt code size. However, the change >>> also >>> means that we can now merge sequences of load-zero-extend followed by a >>> sign-extend >>> into a single load-sign-extend. >>> So we'd turn a (ldrh ; sxth) into an (ldrsh). >>> >>> I've built a few popular embedded benchmarks for a Cortex-M0 target >>> (Thumb1) >>> and looked >>> at the generated assembly for -Os and -O2. I did see both effects. That >>> is, >>> ldrh instructions with an immediate being turned into two instructions: >>> ldrh r4, [r4, #12] >>> -> >>> movs r5, #12 >>> ldrsh r4, [r4, r5] >>> >>> But I also observed the beneficial effect. Various register allocation >>> perturbations also featured in the changes >>> Overall, for every codebase I've looked at both -Os and -O2 the new code >>> was >>> slightly smaller. Probably not enough to call this an outright win, but >>> definitely not a regression overall. >>> >>> As for ARM and Thumb2 states, this series doesn't have a major impact. >>> SPEC2006 bencharks are slightly reduced in size (but nothing to write >>> home >>> about) and there are no code quality regressions. And even the >>> regressions >>> caught by the testsuite in the wmul-[12].c tests don't actually manifest >>> in practice in SPEC, but they are fixed anyway. >>> >>> The series contains one target-independent change to CSE in patch 3 that >>> I'll explain there. >>> >>> The series has been bootstrapped and tested on arm, aarch64 and x86_64. >>> Is this ok for trunk together with Jim's arm.h hunk from >>> g ? >> >> >> The whole series is OK provided the middle-end hunk has been approved. > > > Thanks. > Richi approved the midend hunk at: > https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01733.html
This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69676 -- H.J.