Adding the -ffixed-r10 flag does seem to solve my initial problem. The
application now runs normally and detects stack overflows.

However, __stack_overflow_trap is indeed misbehaving. Although it is
declared with __attribute__((naked, no_instrument_function)), GCC does not
seem to respect the attribute.

GCC still emits a call to __cyg_profile_func_enter on entry to
__stack_overflow_trap, which causes an infinite loop.
See the attached disassembly. At 0x800221a the code branches to
__cyg_profile_func_enter, which it should not.

[image: __stack_overflow_trap.png]
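
To make the loop easier to reason about, here is a minimal host-side sketch
in C. It is not NuttX code: g_sp and g_stack_limit stand in for SP and R10,
and model_func_enter, model_trap, model_overflow and MODEL_MAX_REENTRY are
hypothetical names used only for illustration. It assumes the entry hook
simply compares the stack pointer against the limit and calls the trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model, not NuttX code. */

#define MODEL_MAX_REENTRY 8       /* artificial cap so the model terminates */

static uintptr_t g_stack_limit;   /* stands in for R10 (stack limit) */
static uintptr_t g_sp;            /* stands in for SP */
static unsigned  g_trap_entries;  /* how many times the trap was entered */

static void model_trap(bool instrumented, unsigned depth);

/* Models the entry hook: trap if SP is below the limit. */

static void model_func_enter(bool trap_instrumented, unsigned depth)
{
  if (g_sp < g_stack_limit)
    {
      model_trap(trap_instrumented, depth);
    }
}

/* Models the trap.  If the compiler instruments it, the entry hook runs
 * again on the way in, sees the same overflow and re-enters the trap.
 */

static void model_trap(bool instrumented, unsigned depth)
{
  assert(depth <= MODEL_MAX_REENTRY);
  g_trap_entries++;

  if (instrumented && depth < MODEL_MAX_REENTRY)
    {
      model_func_enter(instrumented, depth + 1);
    }

  /* A real trap would force a fault here instead of returning. */
}

/* Runs one simulated overflow and returns the number of trap entries. */

unsigned model_overflow(bool trap_instrumented)
{
  g_stack_limit  = 0x20001000;
  g_sp           = 0x20000ff0;    /* below the limit: overflow */
  g_trap_entries = 0;

  model_func_enter(trap_instrumented, 0);
  return g_trap_entries;
}
```

With model_overflow(false) the trap is entered exactly once. With
model_overflow(true) it keeps re-entering until the artificial cap stops it.
On the target there is no such cap, which matches the loop I am seeing.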





On Thu, 8 Oct 2020 at 1:39 AM, Fotis Panagiotopoulos <
f.j.pa...@gmail.com> wrote:

> I just spent some time on this.
> It is a false alarm, and seems not related to #1900.
>
> It only happens if I have optimizations enabled. On some builds it also
> happens on other functions than the aforementioned one.
>
> It is caused by R10 contents getting corrupted. Then I realized that I
> need to add the flag -ffixed-r10.
> (This is not mentioned in the Kconfig help text, which only says that I
> have to add -finstrument-functions. Can someone who knows this mechanism
> update the Kconfig help entry?)
>
> Adding the missing flag, however, did not solve my issues. The code
> restarts at a point within __start, before OS initialization takes place.
> There is no apparent reason at the moment. I will try to track this down
> tomorrow.
>
>
> On Wed, 7 Oct 2020 at 2:03 PM, David Sidrane <
> david.sidr...@nscdg.com> wrote:
>
>> Hi Fotis,
>>
>>
>> >NuttX is not able to boot at all with this option selected. I stepped
>> >through the code and it seems that nxsig_initialize actually causes a
>> >stack overflow that is detected by the above check.
>> >Is this a bug, or should I configure something in a different way?
>>
>> Have you seen
>>
>> https://cwiki.apache.org/confluence/display/NUTTX/ARMv7-M+Run+Time+Stack+Checking
>>
>> We have this working in PX4 ('make px4_fmuv5_stackcheck'), as of NuttX
>> master a few weeks ago. You can see it passing on the hardware test rack;
>> the last entry is px4_fmu-v5_stackcheck:
>>
>>
>> http://ci.px4.io:8080/blue/organizations/jenkins/PX4_misc%2FFirmware-hardware/detail/master/2374/pipeline
>>
>> It is most likely a configuration and build issue (albeit the SW stack
>> coloring is broken on master and currently under repair).
>>
>> The __stack_overflow_trap function should carry
>> __attribute__((no_instrument_function)).
>>
>> Also R10 has to be preserved.
>>
>> All files in the build have to be compiled with the same
>> CONFIG_ARMV7M_STACKCHECK settings.
>>
>> The margin is artificially set.
>>
>> Are you using a separate interrupt stack?
>>
>> With a lot of the new changes, we can assume the init path's nesting level
>> has not been kept small.
>>
>> Increase the idle stack size by a lot (1024) and retest.
>>
>>
>> David
>>
>> -----Original Message-----
>> From: Fotis Panagiotopoulos [mailto:f.j.pa...@gmail.com]
>> Sent: Tuesday, October 06, 2020 12:40 PM
>> To: dev@nuttx.apache.org
>> Subject: Stack overflow during system init.
>>
>> Hi everyone,
>>
>> I just enabled CONFIG_ARMV7M_STACKCHECK, as I would like to have this
>> functionality in my project, but I am facing problems.
>>
>> NuttX is not able to boot at all with this option selected. I stepped
>> through the code and it seems that nxsig_initialize actually causes a
>> stack overflow that is detected by the above check.
>> Is this a bug, or should I configure something in a different way?
>>
>> Then I realized that __stack_overflow_trap is broken.
>> When a stack overflow happens, this function is called, and it is
>> supposed to cause a hardfault. However, as __stack_overflow_trap is a
>> function itself, __cyg_profile_func_enter is called again. Once again it
>> detects the overflow and calls __stack_overflow_trap, and so on...
>>
>> Finally, as far as I can see, the stack check is performed on entry to a
>> function, which is wrong. If there is a stack overflow, it will only be
>> detected at the next function call, which may be in an unrelated part of
>> the code. I believe the check should be performed on exit from a
>> function, in which case you can be sure that this specific function
>> caused the overflow. And of course it would solve the issue above with
>> __stack_overflow_trap.
>>
>
