65932

H.J. Lu Thu, 04 Feb 2016 11:36:09 -0800

On Thu, Feb 4, 2016 at 1:34 AM, Kyrill Tkachov
<kyrylo.tkac...@foss.arm.com> wrote:
>
> On 04/02/16 09:13, Ramana Radhakrishnan wrote:
>>
>> On Fri, Jan 22, 2016 at 9:52 AM, Kyrill Tkachov
>> <kyrylo.tkac...@foss.arm.com> wrote:
>>>
>>> Hi all,
>>>
>>> PR target/65932 is a wrong-code bug affecting arm and has manifested
>>> itself
>>> when compiling the Linux kernel, so it's something that we really
>>> ought to fix. The problem stems from the fact that PROMOTE_MODE and
>>> TARGET_PROMOTE_FUNCTION_MODE don't match up on arm.
>>> PROMOTE_MODE also marks the promotion as unsigned, whereas the
>>> TARGET_PROMOTE_FUNCTION_MODE doesn't. This can lead to short variables
>>> being wrongly zero-extended instead of sign-extended.  This also occurs
>>> in PR target/67714.
>>>
>>> Jim Wilson tried a few approaches and from the discussion
>>> on the PR and on the ML
>>> (https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00814.html)
>>> the preferred approach is to make PROMOTE_MODE and
>>> TARGET_PROMOTE_FUNCTION_MODE
>>> match up. Changing TARGET_PROMOTE_FUNCTION_MODE to zero-extend would be
>>> an
>>> ABI
>>> change so we don't want to do that.  Changing PROMOTE_MODE to not
>>> zero-extend
>>> fixes both PR 65932 and 67714.  So Jim's patch is the first patch in this
>>> series.
>>>
>>> It has been posted at
>>> https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02132.html
>>> and this series is based on top of the arm.h hunk of that patch.
>>>
>>> There have been some concerns about the codegen quality fallout from
>>> Jim's
>>> patch, which is what the remaining patches in this series address:
>>>
>>> * 3 new arm testsuite failures: gcc.target/arm/wmul-[123].c.
>>> wmul-1.c and wmul-2.c are cases where we no longer generate
>>> sign-extend+multiply (and accumulate) instructions but instead generate
>>> the normal full-width multiplies (the operands are sign-extended from
>>> preceeding sign-extending loads).  This is a regression for some targets
>>> on which the sign-extending form is faster. Patches 2 and 3 address this.
>>> gcc.target/arm/wmul-3.c is a test where we actually end up generating
>>> better code and so the testcase just needs to be adjusted.
>>> Patch 4 deals with that.
>>>
>>> * Sign-extending rather than zero-extending short values means we make
>>> more
>>> use of the ldrsb and ldrsh arm instructions rather than the
>>> zero-extending
>>> ldrb and ldrh.  On Thumb1 targets ldrsb and ldrsh have more limited
>>> addressing
>>> modes (only REG + REG), which could hurt code size. However, the change
>>> also
>>> means that we can now merge sequences of load-zero-extend followed by a
>>> sign-extend
>>> into a single load-sign-extend.
>>> So we'd turn a (ldrh ; sxth) into an (ldrsh).
>>>
>>> I've built a few popular embedded benchmarks for a Cortex-M0 target
>>> (Thumb1)
>>> and looked
>>> at the generated assembly for -Os and -O2. I did see both effects. That
>>> is,
>>> ldrh instructions with an immediate being turned into two instructions:
>>>      ldrh    r4, [r4, #12]
>>> ->
>>>      movs    r5, #12
>>>      ldrsh    r4, [r4, r5]
>>>
>>> But I also observed the beneficial effect. Various register allocation
>>> perturbations also featured in the changes
>>> Overall, for every codebase I've looked at both -Os and -O2 the new code
>>> was
>>> slightly smaller. Probably not enough to call this an outright win, but
>>> definitely not a regression overall.
>>>
>>> As for ARM and Thumb2 states, this series doesn't have a major impact.
>>> SPEC2006 bencharks are slightly reduced in size (but nothing to write
>>> home
>>> about) and there are no code quality regressions. And even the
>>> regressions
>>> caught by the testsuite in the wmul-[12].c tests don't actually manifest
>>> in practice in SPEC, but they are fixed anyway.
>>>
>>> The series contains one target-independent change to CSE in patch 3 that
>>> I'll explain there.
>>>
>>> The series has been bootstrapped and tested on arm, aarch64 and x86_64.
>>> Is this ok for trunk together with Jim's arm.h hunk from
>>> g ?
>>
>>
>> The whole series is OK provided the middle-end hunk has been approved.
>
>
> Thanks.
> Richi approved the midend hunk at:
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01733.html


This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69676


-- 
H.J.

Re: [PATCH][ARM][0/4] Fixing PR target/65932

Reply via email to