On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson <jim.wil...@linaro.org> wrote:
> This is my suggested fix for PR 65932, which is a linux kernel
> miscompile with gcc-5.1.
>
> The problem here is caused by a chain of events.  The first is that
> the relatively new eipa_sra pass creates fake parameters that behave
> slightly differently than normal parameters.  The second is that the
> optimizer creates phi nodes that copy local variables to fake
> parameters and/or vice versa.  The third is that the out-of-ssa pass
> assumes that it can emit simple move instructions for these phi nodes.
> And the fourth is that the ARM port has a PROMOTE_MODE macro that
> forces QImode and HImode to unsigned, but a
> TARGET_PROMOTE_FUNCTION_MODE hook that does not.  So signed char and
> short parameters have different in-register representations than local
> variables, and require a conversion when copying between them, a
> conversion that the out-of-ssa pass can't easily emit.
>
> Ultimately, I think this is a problem in the arm backend.  It should
> not have a PROMOTE_MODE macro that is changing the sign of char and
> short local variables.  I also think that we should merge the
> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to
> prevent this from happening again.
>
> I see four general problems with the current ARM PROMOTE_MODE definition.
> 1) Unsigned char is only faster for armv5 and earlier, before the sxtb
> instruction was added.  It is a loss for armv6 and later.
> 2) Unsigned short was only faster for targets that don't support
> unaligned accesses.  Support for these targets was removed a while
> ago, and this PROMOTE_MODE hunk should have been removed at the same
> time.  It was accidentally left behind.
> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook; when it was
> converted to a function, the PROMOTE_MODE code was copied without the
> UNSIGNEDP changes.  Thus it is only an accident that
> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree.  Changing
> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE
> changes to resolve the difference are safe.
> 4) There is a general principle that you should only change signedness
> in PROMOTE_MODE if the hardware forces it, as otherwise this results
> in extra conversion instructions that make code slower.  The mips64
> hardware for instance requires that 32-bit values be sign-extended
> regardless of type, and instructions may trap if this is not true.
> However, it has a set of 32-bit instructions that operate on these
> values, and hence no conversions are required.  There is no similar
> case on ARM. Thus the conversions are unnecessary and unwise.  This
> can be seen in the testcases where gcc emits both a zero-extend and a
> sign-extend inside a loop, as the sign-extend is required for a
> compare, and the zero-extend is required by PROMOTE_MODE.

Given Kyrill's testing with the patch and the reasonably detailed
check of the effects of the code generation changes, the arm.h hunk is
OK.  I do think we should state explicitly in the documentation that
PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree, and better
still, maybe add a checking assert for this in the mid-end, but that
could be the subject of a follow-up patch.
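
As a rough idea of what such a check would compare, here is a
self-contained toy model.  Everything in it is an assumption made for
illustration: the enum, the struct and the function names are
hypothetical stand-ins, not GCC's machine_mode or the real macro and
hook interfaces.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC machine modes.  */
enum mode { QI, HI, SI };

struct promotion {
  enum mode mode;
  bool unsignedp;
};

/* Models the old ARM PROMOTE_MODE: sub-word integers are widened to
   SImode and forced unsigned.  */
struct promotion promote_local(enum mode m, bool unsignedp)
{
  struct promotion p = { m, unsignedp };
  if (m == QI || m == HI) {
    p.mode = SI;
    p.unsignedp = true;
  }
  return p;
}

/* Models the ARM TARGET_PROMOTE_FUNCTION_MODE: sub-word integers are
   widened to SImode but keep their signedness.  */
struct promotion promote_param(enum mode m, bool unsignedp)
{
  struct promotion p = { m, unsignedp };
  if (m == QI || m == HI)
    p.mode = SI;
  return p;
}

/* The condition a checking assert would test: both promotions give
   the same mode and the same signedness.  */
bool promotions_agree(enum mode m, bool unsignedp)
{
  struct promotion a = promote_local(m, unsignedp);
  struct promotion b = promote_param(m, unsignedp);
  return a.mode == b.mode && a.unsignedp == b.unsignedp;
}
```

In this model the check fails for signed QImode and HImode and passes
for everything else, which matches the disagreement the patch removes.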

Ok to apply just the arm.h hunk as I think Kyrill has taken care of
the testsuite fallout separately.


regards
Ramana




>
> My change was tested with an arm bootstrap, make check, and SPEC
> CPU2000 run.  The original poster verified that this gives a linux
> kernel that boots correctly.
>
> The PROMOTE_MODE change causes 3 testsuite testcases to fail.  These
> are tests to verify that smulbb and/or smlabb are generated.
> Eliminating the unnecessary sign conversions causes us to get better
> code that doesn't include the smulbb and smlabb instructions.  I had
> to modify the testcases to get them to emit the desired instructions.
> With the testcase changes there are no additional testsuite failures,
> though I'm concerned that these testcases with the changes may be
> fragile, and future changes may break them again.



>
> If there are ARM parts where smulbb/smlabb are faster than mul/mla,
> then maybe we should try to add new patterns to get the instructions
> emitted again for the unmodified testcases.
>
> Jim
