On 13 September 2017 at 18:33, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > Hi Christophe, > > > On 13/09/17 16:23, Christophe Lyon wrote: >> >> Hi, >> >> On 12 October 2016 at 11:22, Christophe Lyon <christophe.l...@linaro.org> >> wrote: >>> >>> On 12 October 2016 at 11:14, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> >>> wrote: >>>> >>>> On 12/10/16 09:59, Christophe Lyon wrote: >>>>> >>>>> Hi Kyrill, >>>>> >>>>> On 7 October 2016 at 17:00, Kyrill Tkachov >>>>> <kyrylo.tkac...@foss.arm.com> >>>>> wrote: >>>>>> >>>>>> Hi Christophe, >>>>>> >>>>>> >>>>>> On 07/09/16 21:05, Christophe Lyon wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> The attached patch is a first part to solve PR 67591: it removes >>>>>>> several occurrences of "IT blocks containing 32-bit Thumb >>>>>>> instructions are deprecated in ARMv8" messages in the >>>>>>> gcc/g++/libstdc++/fortran testsuites. >>>>>>> >>>>>>> It does not remove them all yet. This patch only modifies the >>>>>>> *cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp, >>>>>>> *and_scc_scc and *and_scc_scc_cmp patterns. >>>>>>> Additional work is required in sub_shiftsi etc, at least. >>>>>>> I've started looking at these, but I decided I could already >>>>>>> post this self-contained patch to check if this implementation >>>>>>> is OK. >>>>>>> >>>>>>> Regarding *cmp_and and *cmp_ior patterns, the addition of the >>>>>>> enabled_for_depr_it attribute is aggressive in the sense that it >>>>>>> keeps >>>>>>> only the alternatives with 'l' and 'Py' constraints, while in some >>>>>>> cases the constraints could be relaxed. Indeed, these 2 patterns can >>>>>>> swap their input comparisons, meaning that any of them can be emitted >>>>>>> in the IT-block, and is thus subject to the ARMv8 deprecation. >>>>>>> The generated code is possibly suboptimal in the cases where the >>>>>>> operands are not swapped, since 'r' could be used. >>>>>>> >>>>>>> Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a >>>>>>> and --with-cpu=cortex-a57 --with-mode=thumb, showing only >>>>>>> improvements: >>>>>>> >>>>>>> >>>>>>> >>>>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html >>>>>>> >>>>>>> Bootstrapped OK on armv8l HW. >>>>>>> >>>>>>> Is this OK? >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Christophe >>>>>> >>>>>> >>>>>> (define_insn_and_split "*ior_scc_scc" >>>>>> - [(set (match_operand:SI 0 "s_register_operand" "=Ts") >>>>>> + [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts") >>>>>> (ior:SI (match_operator:SI 3 "arm_comparison_operator" >>>>>> - [(match_operand:SI 1 "s_register_operand" "r") >>>>>> - (match_operand:SI 2 "arm_add_operand" "rIL")]) >>>>>> + [(match_operand:SI 1 "s_register_operand" "r,l") >>>>>> + (match_operand:SI 2 "arm_add_operand" "rIL,lPy")]) >>>>>> (match_operator:SI 6 "arm_comparison_operator" >>>>>> - [(match_operand:SI 4 "s_register_operand" "r") >>>>>> - (match_operand:SI 5 "arm_add_operand" "rIL")]))) >>>>>> + [(match_operand:SI 4 "s_register_operand" "r,l") >>>>>> + (match_operand:SI 5 "arm_add_operand" "rIL,lPy")]))) >>>>>> >>>>>> Can you please put the more restrictive alternatives (lPy) first? >>>>>> Same with the other patterns your patch touches. >>>>>> Ok with that change if a rebased testing run is ok. >>>>>> Sorry for the delay in reviewing. >>>>>> >>>>> OK, I will update my patch accordingly. >>>>> >>>>> However, when I discussed this with Ramana during the Cauldron, >>>>> he requested benchmark results. So far, I was able to run spec2006 >>>>> on an APM machine, and I'm seeing performance changes in the >>>>> range 11% improvement (465.tonto) to 7% degradation (433.milc). >>>>> >>>>> Would that be acceptable? >>>> >>>> >>>> Those sound like quite large swings. >>> >>> Indeed, but most are in the -1%-+1% range. >>> >>>> Are you sure the machine was not running anything else at the time >>>> or playing tricks with frequency scaling? >>> >>> No, I had no such guarantee. I used this machine temporarily, >>> first to check that bootstrap worked. I planed to use another >>> board with an A57 "standard" microarch for proper >>> benchmarking, but I'm not sure when I'll have access to it >>> (wrt to e/o gcc stage1), that's why I reported these early >>> figures. >>> >>>> Did all iterations of SPEC show a consistent difference? >>>> >>>> If the changes are consistent, could you have a look at the codegen >>>> to see if there are any clues to the differences? >>> >>> I will update my patch according to your comment, re-run the bench >>> and have a deeper look at the codegen differences. >>> >> I have finally been able to run benchmarks with my patch updated >> according to your comment, on new machines where we have >> better control of the environment (frequency, etc...). >> >> These machines use cortex-a57 CPUs and spec2006 shows little >> difference with and without this patch. >> >> Comparing several runs with and without the patch: >> - gcc is about 1% slower >> - povray about 1% faster >> - omnetpp about 1.5% faster > > > Yeah, that looks line with what I'd expect for this change. > >> The others are in the noise level. >> >> OK for trunk? > > > This is ok if a bootstrap and test run on top of a recent compiler > configured for --with-thumb > and for armv8-a passes okay (to exercise the new code). > Thanks for persevering with this! >
Thanks for the prompt approval. Committed after bootstrap+regtest on top of r252757. I'll take a look at the remaining similar warnings. Christophe > Kyrill > > >> >> Thanks, >> >> Christophe >> >> >>>> I'd like to get an explanation for these differences before committing >>>> this patch. If they are just an effect of the more restrictive >>>> constraints >>>> then there may be not much we can do, but I'd like to make sure there's >>>> not >>>> anything else unexpected going on. >>>> >>> Thanks, >>> >>> Christophe >>> >>>>> The number of warnings (IT blocks containing 32-bit Thumb instructions >>>>> are deprecated in ARMv8) >>>>> was 712 without my patch and 122 with it. (using the hosts's binutils >>>>> 2.24/ubuntu). >>>>> I expected some warning, since as I said earlier other patterns need >>>>> to be updated. >>>> >>>> >>>> Understood. That's fine. >>>> >>>> Thanks, >>>> Kyrill >>>> >>>> >>>>> Christophe >>>>> >>>>> >>>>>> Thanks for improving this area! >>>>>> Kyrill >>>>>> >