On 24/01/17 16:36, Bernd Schmidt wrote:
On 01/24/2017 05:30 PM, Kyrill Tkachov wrote:

The -mfpu option is overridden in the testcase to enable the ARMv8
instructions. So to reproduce the compilation in that testcase you'd want
-mfpu=fp-armv8 or something equivalent rather than vfpv3-d16-fp16.

Exact steps please. No one who's not well-versed in all the ARM variants will be able to figure this out. I've been able to generate identical before/after code, both with and without vselvs.f64 instructions, after trying out a number of switch combinations, but I've not been able to find a way to show where the patch makes a difference.


I was just off-hand trying to give the options that would be
expected to be exercised in this testcase.

Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S
-mfpu=fp-armv8 -mcpu=cortex-a5 -mfloat-abi=hard) I get
the test failing before and after the patch. The code generated is:
        vcmp.f64        d0, d1
        vmrs    APSR_nzcv, FPSCR
        vmovvs.f64      d0, d1
        bx      lr

whereas the desired (e.g. with -mcpu=cortex-a57) is:
        vcmp.f64        d0, d1
        vmrs    APSR_nzcv, FPSCR
        vselvs.f64      d0, d1, d0
        bx      lr

Given that VSEL is an ARMv8-A instruction and Cortex-A5 is an ARMv7-A CPU,
it doesn't make much sense to try getting it to generate that VSEL.
So maybe we should just add an explicit -mtune=cortex-a57 to the testcases.
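
For anyone trying to reproduce this, the two sequences above would come from
source of roughly the following shape. This is only a sketch: the function
name select_unordered is made up for illustration, and the body is inferred
from the assembly (VS is set when the compare is unordered) rather than
copied from the testcase, which also carries DejaGnu directives not shown here.

```c
/* Hypothetical reduction, inferred from the assembly in this mail.
   vcmp.f64 d0, d1 sets the V flag when x and y are unordered (a NaN is
   involved); "vselvs.f64 d0, d1, d0" then picks y if VS, otherwise x.  */
double
select_unordered (double x, double y)
{
  /* __builtin_isunordered is a GCC builtin; it returns nonzero when
     either argument is a NaN.  */
  return __builtin_isunordered (x, y) ? y : x;
}
```

Compiling something like this with the options quoted above and inspecting
the .s output should show whether the conditional vmov or the vsel is used.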

Thanks,
Kyrill


Bernd

