On Thu, Jan 26, 2017 at 3:56 PM, Andre Vieira (lists)
<andre.simoesdiasvie...@arm.com> wrote:
> On 20/01/17 14:08, Ramana Radhakrishnan wrote:
>> On Wed, Dec 28, 2016 at 9:58 AM, Andre Vieira (lists)
>> <andre.simoesdiasvie...@arm.com> wrote:
>>> On 29/11/16 09:45, Andre Vieira (lists) wrote:
>>>> On 17/11/16 10:00, Ramana Radhakrishnan wrote:
>>>>> On Thu, Oct 6, 2016 at 2:57 PM, Andre Vieira (lists)
>>>>> <andre.simoesdiasvie...@arm.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> This patch tackles the issue reported in PR71607. This patch takes a
>>>>>> different approach for disabling the creation of literal pools. Instead
>>>>>> of disabling the patterns that would normally transform the rtl into
>>>>>> actual literal pools, it disables the creation of this literal pool rtl
>>>>>> by making the target hook TARGET_CANNOT_FORCE_CONST_MEM return true if
>>>>>> arm_disable_literal_pool is true. I added patterns to split floating
>>>>>> point constants for both SF and DFmode. A pattern to handle the
>>>>>> addressing of label_refs had to be included as well since all
>>>>>> "memory_operand" patterns are disabled when
>>>>>> TARGET_CANNOT_FORCE_CONST_MEM returns true. Also the pattern for
>>>>>> splitting 32-bit immediates had to be changed, it was not accepting
>>>>>> unsigned 32-bit unsigned integers with the MSB set. I believe
>>>>>> const_int_operand expects the mode of the operand to be set to VOIDmode
>>>>>> and not SImode. I have only changed it in the patterns that were
>>>>>> affecting this code, though I suggest looking into changing it in the
>>>>>> rest of the ARM backend.
>>>>>>
>>>>>> I added more test cases. No regressions for arm-none-eabi with
>>>>>> Cortex-M0, Cortex-M3 and Cortex-M7.
>>>>>>
>>>>>> Is this OK for trunk?
>>>>>
>>>>> Including -mslow-flash-data in your multilib flags ? If no regressions
>>>>> with that ok .
>>>>>
>>>>>
>>>>> regards
>>>>> Ramana
>>>>>
>>>>>>
>>>>
>>>> Hello,
>>>>
>>>> I found some new ICE's with the -mslow-flash-data testing so I had to
>>>> rework this patch. I took the opportunity to rebase it as well.
>>>>
>>>> The problem was with the way the old version of the patch handled label
>>>> references.  After some digging I found I wasn't using the right target
>>>> hook and so I implemented the 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' for
>>>> ARM.  This target hook determines whether a literal pool ends up in an
>>>> 'object_block' structure. So I reverted the changes made in the old
>>>> version of the patch to the ARM implementation of the
>>>> 'TARGET_CANNOT_FORCE_CONST_MEM' hook and rely on
>>>> 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' instead. This patch adds an ARM
>>>> implementation for this hook that returns false if
>>>> 'arm_disable_literal_pool' is set to true and true otherwise.
>>>>
>>>> This version of the patch also reverts back to using the check for
>>>> 'SYMBOL_REF' in 'thumb2_legitimate_address_p' that was removed in the
>>>> last version, this code is required to place the label references in
>>>> rodata sections.
>>>>
>>>> Another thing this patch does is revert the changes made to the 32-bit
>>>> constant split in arm.md. The reason this was needed before was because
>>>> 'real_to_target' returns a long array and does not sign-extend values in
>>>> it, which would make sense on hosts with 64-bit longs. To fix this the
>>>> value is now casted to 'int' first.  It would probably be a good idea to
>>>> change the 'real_to_target' function to return an array with
>>>> 'HOST_WIDE_INT' elements instead and either use all 64-bits or
>>>> sign-extend them.  Something for the future?
>>>>
>>>> I added more test cases in this patch and reran regression tests for:
>>>> Cortex-M0, Cortex-M4 with and without -mslow-flash-data. Also did a
>>>> bootstrap+regressions on arm-none-linux-gnueabihf.
>>>>
>>>> Is this OK for trunk?
>>>>
>>>> Cheers,
>>>> Andre
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2016-11-29  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>>>>
>>>>     PR target/71607
>>>>     * config/arm/arm.md (use_literal_pool): Removes.
>>>>     (64-bit immediate split): No longer takes cost into consideration
>>>>     if 'arm_disable_literal_pool' is enabled.
>>>>     * config/arm/arm.c (arm_use_blocks_for_constant_p): New.
>>>>     (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Define.
>>>>     (arm_max_const_double_inline_cost): Remove use of
>>>> arm_disable_literal_pool.
>>>>     * config/arm/vfp.md (no_literal_pool_df_immediate): New.
>>>>     (no_literal_pool_sf_immediate): New.
>>>>
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2016-11-29  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>>>>             Thomas Preud'homme  <thomas.preudho...@arm.com>
>>>>
>>>>     PR target/71607
>>>>     * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ...
>>>>     * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this.
>>>>     * gcc.target/arm/thumb2-slow-flash-data-2.c: New.
>>>>     * gcc.target/arm/thumb2-slow-flash-data-3.c: New.
>>>>     * gcc.target/arm/thumb2-slow-flash-data-4.c: New.
>>>>     * gcc.target/arm/thumb2-slow-flash-data-5.c: New.
>>>>
>>> Hello,
>>>
>>> As I have mentioned in my commit message for the fix on embedded-6  (see
>>> https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01304.html) I found an
>>> issue with this code when testing its backport on the embedded-6-branch.
>>>
>>> I misread the definition of the target hook
>>> TARGET_USE_BLOCKS_FOR_CONSTANT_P and it seems the way I implemented it
>>> before only changed code generation if the -mslow-flash-data option
>>> wasn't passed. I.e. I don't need to implement it to solve this issue.
>>> The other changes in this patch are sufficient...
>>>
>>> I reran regressions for arm-none-eabi-gcc with the following targets:
>>> Cortex-M0, Cortex-M3, Cortex-M7 and ARMv8-M Baseline. I also ran
>>> bootstraps and full regressions for arm-none-linux-gnueabihf both in ARM
>>> and THUMB mode. All green!
>>
>> And presumably with -mslow-flash-data as a multilib again ?
>>
>>>
>>> Is this OK for trunk?
>>
>> We should also be adding to arm_reorg
>>
>>
>> if (arm_disable_literal_pool)
>>   return ;
>>
>> after we split all the insns.
>>
>>
>> After all if the splitters added had done their job, there's no reason
>> to run the minipool placement code at all - is there  :) ?
>>
>> Furthermore is it guaranteed that we will not have any references to
>> the literal pool post reload - IIRC , reload in its older days could
>> easily just push any old constant into a standard constant pool entry
>> and make things work . Is LRA likely not to produce any constant pool
>> entries in this form ? The bad thing here would be producing literal
>> pool entries when the user had asked the compiler to do so - thus
>> having more belts and braces for this would be good (thus , why is a
>> change to arm_cannot_force_const_mem to return false in the face of
>> arm_disable_literal_pool not appropriate ? ) Nick also had a couple of
>> good points on making the test more usable across all testers in
>> private conversation - so please make those changes.
>>
>>
>>
>>
>> regards
>> Ramana
>>
>>
> Hello,
>
> I forgot to mention I ran the full regression tests for -mcpu=cortex-m3
> with the -mslow-flash-data option, this showed up clean.
>
> After I disabled the creation of literal pools in arm_reorg if
> 'arm_disable_literal' pool is true the same test showed regressions with
> TLS tests. The compiler generates assembly that tries to load directly
> from a gott table symbol which is invalid assembly.  We do not have the
> required AArch32 TLS relocations for instructions.

Ok. That sounds reasonable.


>
> However, it might be worth mentioning that the GNU ARM Embedded
> Toolchain is configured to use TLS emulation and this code is thus not
> generated for toolchains configured with --disable-threads and
> --disable-tls.  This wrong code generation would thus only occur in the
> "niche" case of configuring a toolchain with threads and TLS, for a
> M-profile device whilst compiling code containing TLS variables using
> -mslow-flash-data.

Agreed -


I have seen other requests for a similar feature even on A profile,
thus documenting this in the form of comments in the source
would  be very welcome.

>
> As for your question whether a change to 'arm_cannot_force_const_mem' to
> return false in the face of 'arm_disable_literal_pool'.  I think you
> meant to say 'true' here.  I had tried this before, but this leads to
> arm_legitimate_constant_p always returning false, which prevents the use
> of constants altogether, even when synthesized into movw and movt's.
>

Yes I meant to say true  , jetlag ...

Did you try refactoring the existing logic into a helper function that
does what it does today and only return true for
target_slow_flash_data in the arm_cannot_force_const_mem interface ?

i.e.

arm_cannot_force_const_mem => arm_cannot_force_const_mem_1

arm_legitimate_constant_p calls arm_cannot_force_const_mem_1 ..
and

New definition of arm_cannot_force_const_mem returns true for
constants for target_flash_slow_data.

and then handles arm_cannot_force_const_mem_1 as it exists today

Can you experiment with that please ?

I'm happy with that as a follow-up patch.  I'm really keen on making
sure that the compiler does *not* produce any literal pools at all for
slow-flash-data and we fail hard in case we can't handle it. The
requirement is no literal pools and the compiler should enforce that
to the best of its ability and not paper over problems.



> The way we currently use the target hooks around literal pools should
> perhaps be reviewed, but that is a much bigger task. For instance, we
> should perhaps be using 'TARGET_USE_BLOCKS_FOR_CONSTANT_P', which reads:
> "This hook should return true if pool entries for constant x can be
> placed in an object_block structure.". We currently use the default
> implementation for this hook which returns false.

I'm not sure if that is appropriate here as this tries to put things
into constant pool blocks that are shared probably between many
functions which implies code is able to access the constant pool
anywhere. There is some tie in to section anchors if I remember
correctly. I don't think we can use this in the ARM backend because we
really need the minipool placement to work for constant pools. I think
that is orthogonal to this.

>
> I reran regressions for arm-none-eabi-gcc with the following targets:
> Cortex-M3, Cortex-M4 and Cortex-M7. I did not reran bootstraps as the
> change in this new version can only affect M-profile targets and could
> never change code-generation for other targets. Some of these tests are
> still running, I will not commit anything before they are done. Though I
> don't expect anything to change, since the -mslow-flash-data regression
> testing, see bellow, worked fine.
>
> I also reran the -mslow-flash-data for arm-none-eabi with Cortex-M3, and
> as I mentioned earlier depending on how GCC is configured it could
> regress for TLS tests.


Given that maybe we should just disable the use of TLS in
-mslow-flash-data and give an error ?

Ok with all the caveats as listed above.


Ramana


>
> Is this OK for trunk?


>
> Cheers,
> Andre

Reply via email to