Re: STM32H7 crash

Peter Barada Sat, 07 Feb 2026 14:28:12 -0800

Cranking up CONFIG_INIT_STACKSIZE to 3072 fixes the issue.

I tried enabling STACK_COLORATION, STACK_USAGE, and ARMV7M_STACKTRACE while leaving INIT_STACKSIZE at 2048, hoping to debug using STM32CubeIDE, but when I try "time ls" the GDB session is lost (which seems strange).

If I then enable ARMV7M_STACKCHECK_BREAKPOINT, GDB stops when it detects the stack overflow and I can get a call stack to understand why, but I can't continue (to show the dump).

Finally, after enabling ARCH_STACKDUMP, ARMV7M_STACKCHECK, SCHED_BACKTRACE, STACK_COLORATION, and STACK_USAGE, disabling STACKCHECK_BREAKPOINT, and setting ARCH_INTERRUPTSTACK=2048 and ARCH_STACKDUMP_MAX_LENGTH=1024, I get a full dump when it detects the stack overflow.
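
For anyone unfamiliar with the options named above, here is a minimal Python sketch of the idea behind stack coloration: the stack is pre-filled with a known pattern, and usage is later estimated by scanning for the first word that no longer holds it. This is an illustration only, not NuttX code; the fill value 0xdeadbeef, the word-list model of the stack, and the function names are all assumptions made for the example.

```python
# Illustrative model of stack coloration (not NuttX source code).
# Assumption: the stack is modelled as a list of 32-bit words, index 0 is
# the lowest address, and the stack grows downward from the highest index.

STACK_COLOR = 0xDEADBEEF  # assumed fill pattern for this sketch


def color_stack(size_bytes: int) -> list[int]:
    """Return a freshly 'colored' stack of size_bytes (rounded down to words)."""
    return [STACK_COLOR] * (size_bytes // 4)


def use_stack(stack: list[int], used_bytes: int) -> None:
    """Simulate a task dirtying the top used_bytes of a downward-growing stack."""
    words = min(used_bytes // 4, len(stack))
    for i in range(len(stack) - words, len(stack)):
        stack[i] = 0  # any value other than the color


def high_water_mark(stack: list[int]) -> int:
    """Bytes used: scan up from the lowest address for the first dirty word."""
    for i, word in enumerate(stack):
        if word != STACK_COLOR:
            return (len(stack) - i) * 4
    return 0


stack = color_stack(2048)
use_stack(stack, 1500)
print(high_water_mark(stack), "of", len(stack) * 4, "bytes used")
```

A reading equal to the full stack size means every word was touched, which suggests (but does not prove) that the task ran past the end of its stack.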


Thanks for the help!


On 2/7/26 03:25, raiden00pl wrote:

hi, this is a 100% stack issue. Increase all stack sizes to at least 4092.
Another option is to enable full optimisation with CONFIG_DEBUG_FULLOPT=y,
should also help.

quick tip: about 80% of crashes in NuttX are stack issues, so the first
thing you should always do when such crashes occur is to increase all
stack sizes :)

On Sat, 7 Feb 2026 at 04:02, Matteo Golin <[email protected]> wrote:

I am not familiar enough, but there should be an option for stack canaries.
I haven't had much luck with that configuration, and I imagine that your
DEBUGASSERT will trigger before stack smashing is detected.

Matteo

On Fri, Feb 6, 2026, 8:45 PM Peter Barada <[email protected]> wrote:

Haven't tried yet (personally I feel I should know _why_ it happens) - is
there a config for compiling in stack checking on function entry?

On 2/6/26 20:22, Matteo Golin wrote:

Hmmm, if the problem goes that far back it may not be worth triaging that
way. Things have probably diverged so much since then. No luck with the
stack increase?

Matteo

On Fri, Feb 6, 2026, 8:18 PM Peter Barada <[email protected]> wrote:

Matteo,

I'm walking back release points and have had to change board
configuration names (to nucleo-h743zi) and rename nuttx-apps to apps, and
I'm still seeing the fault in the release/11.0 branch.

I'm trying to go back further but wondering if I'll find a bisect start
point...
On 2/6/26 17:05, Matteo Golin wrote:

Hi Peter,

My approach is kind of a headache since bisecting over an area where apps
and NuttX are not always in sync is a major limitation of the split repo.

My approach is usually:

- Start the bisect in kernel
- Check the commit date of the current HEAD
- Check out to a commit of the same/similar date in apps
- Build
- Mentally note if this commit was good or bad based on the results of
running the image
- make distclean (avoids artifacts carrying over between bisections and
breaking everything)
- Mark commit good or bad with git bisect

Then basically repeat this until bisecting is finished. It sucks and I
did suggest a script in /tools/ to try and automate most of this, but I
never got around to writing it.
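
The manual loop described above could be partly automated along these lines. This is only a rough sketch, not the script that was proposed for /tools/: the directory names `nuttx` and `apps`, the branch name `master`, and the helper names are assumptions, and the build/flash/test step is left out because it depends on the board.

```python
import subprocess


def git(repo: str, *args: str) -> str:
    """Run a git command inside repo and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def head_date_cmd() -> list[str]:
    """Arguments that print the committer date of HEAD in strict ISO 8601."""
    return ["show", "-s", "--format=%cI", "HEAD"]


def matching_commit_cmd(date: str, branch: str) -> list[str]:
    """Arguments that print the newest commit on branch at or before date."""
    return ["rev-list", "-1", f"--before={date}", branch]


def sync_apps(nuttx_dir: str = "nuttx", apps_dir: str = "apps",
              apps_branch: str = "master") -> str:
    """Check out the apps commit closest in time to the kernel's current HEAD."""
    date = git(nuttx_dir, *head_date_cmd())
    commit = git(apps_dir, *matching_commit_cmd(date, apps_branch))
    if not commit:
        # Hypothetical failure mode: apps history does not reach back that far.
        raise RuntimeError(f"no commit on {apps_branch} at or before {date}")
    git(apps_dir, "checkout", "--detach", commit)
    return commit
```

The idea would be to call sync_apps() after each bisect step in the kernel tree, then run make distclean, configure, build, and test as usual before marking the commit good or bad.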

I would suggest you start by checking for the issue on a stable release
(e.g. 12.12.0) to see if that's a good commit you can start from. Usually
those releases have a higher degree of testing because everyone who voted
for the release ran some images on their hardware.

That's honestly a lot of work but you never know if it'll end up being
faster than trying to triage with logs!

Matteo

On Fri, Feb 6, 2026, 4:50 PM Nathan Hartman <[email protected]>
wrote:

First place I would look: is the stack overflowing? (You could try
enabling some of the stack debugging features.)

On Fri, Feb 6, 2026 at 4:34 PM Peter Barada <[email protected]>
wrote:

Matteo,

I don't know if this was working before but if you can suggest a good
starting point I can cycle through git bisect to narrow down to the
failing commit.  What's the best approach to using git bisect across
multiple repos (since changes in nuttx may have necessary changes in
nuttx-apps and need to keep them in sync at each build point)?

As an aside, I also have a nucleo-f446re board, and 'time ls' works fine
there.

Further, does anyone have GDB scripts that make it easier to decipher
NuttX structures from memory (e.g. dump task/semaphore lists, etc.)?
I've started cobbling snippets together but figured I'd ask before
reinventing the wheel.


On 2/6/26 16:12, Matteo Golin wrote:

Hi Peter,

If you happen to know that this was working before on an older NuttX
version, you could use git bisect to narrow down the breaking commit.
Then the issue might be clearer.

Best,
Matteo

On Fri, Feb 6, 2026, 4:09 PM Peter Barada <[email protected]> wrote:

     I have an STM32 Nucleo-h753zi board and configured a build for
     nucleo-h743zi2:nsh (which is the closest board/chip; the stm32h753zi
     is the same as the stm32h743zi, but the h753zi includes crypto
     acceleration hardware).

     Build works, but if I boot and try 'time ls' nuttx faults:

     nsh> uname -a
     NuttX 0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm nucleo-h743zi2
     nsh> time ls
     /:
       dev/

     0.00dump_assert_info: Current Version: NuttX  0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm
     dump_assert_info: Assertion failed panic: at file: :0 task: <noname> process: <noname> 0x800c9fd
     up_dump_register: R0: 0801e624 R1: 0000000a R2: 00000050  R3: 0000000a
     up_dump_register: R4: 00000001 R5: 240000e4 R6: 00000000  FP: 00000000
     up_dump_register: R8: 00000000 SB: 00000000 SL: 00000000 R11: 00000000
     up_dump_register: IP: 00000000 SP: 38000c08 LR: 080059db  PC: 08005984
     up_dump_register: xPSR: 41000000 BASEPRI: 00000000 CONTROL: 00000000
     up_dump_register: EXC_RETURN: ffffffe9
     dump_stackinfo: User Stack:
     dump_stackinfo:   base: 0x38000518
     dump_stackinfo:   size: 00002000
     dump_stackinfo:     sp: 0x38000c08
     stack_dump: 0x38000be8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
     stack_dump: 0x38000c08: 0000000a 0801e624 0801e624 38000200 38000fac 00000000 0801e624 080172c1
     stack_dump: 0x38000c28: 00000000 0801e624 38000200 38000158 00000000 00000000 38000fac 0800caa1
     stack_dump: 0x38000c48: 00000000 0800cc77 0801e624 000002fc 38000500 00000001 00000001 38000cf0
     stack_dump: 0x38000c68: 38000cf0 00000008 38000200 00000000 00000000 0800ca79 38000500 00000001
     stack_dump: 0x38000c88: 00000064 38000cf0 00000064 0800ca33 38000500 00000001 00000064 00000000
     stack_dump: 0x38000ca8: 00000000 08009325 00000000 38000500 00000001 0800c9fd 00000000 080052f1
     stack_dump: 0x38000cc8: 00000000 38000500 00000000 38000158 00000001 00000001 00000000 00000000
     stack_dump: 0x38000ce8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
     dump_tasks:    PID GROUP PRI POLICY   TYPE    NPX STATE  EVENT  SIGMASK          STACKBASE  STACKSIZE   COMMAND
     dump_task:       0     0   0 FIFO     Kthread -   Ready         0000000000000000 0x240018b0      1000   <noname>
     dump_task:       1     1 100 RR       Task    -   Running       0000000000000000 0x38000518      2000   <noname> ��]���&

     Wondering if anyone has run across this before?  Backtrace shows:

     Program received signal SIGTRAP, Trace/breakpoint trap.
     exception_common () at armv7-m/arm_exception.S:127
     127             mrs             r0, ipsr           /* R0=exception number */
     where
     #0  exception_common () at armv7-m/arm_exception.S:127
     #1  <signal handler called>
     #2  0x08005984 in env_cmpname (pszname=0x801e624 "PS1",
          peqname=0xa <error: Cannot access memory at address 0xa>)
          at environ/env_findvar.c:50
     #3  0x080059da in env_findvar (group=0x38000200, pname=0x801e624 "PS1")
          at environ/env_findvar.c:105
     #4  0x080172c0 in getenv (name=0x801e624 "PS1") at environ/env_getenv.c:89
     #5  0x0800caa0 in nsh_update_prompt () at nsh_prompt.c:77
     #6  0x0800cc76 in nsh_session (pstate=0x38000cf0, login=1, argc=1,
          argv=0x38000500) at nsh_session.c:249
     #7  0x0800ca78 in nsh_consolemain (argc=1, argv=0x38000500)
          at nsh_consolemain.c:77
     #8  0x0800ca32 in nsh_main (argc=1, argv=0x38000500) at nsh_main.c:76
     #9  0x08009324 in nxtask_startup (entrypt=0x800c9fd <nsh_main>, argc=1,
          argv=0x38000500) at sched/task_startup.c:72
     #10 0x080052f0 in nxtask_start () at task/task_start.c:104
     #11 0x00000000 in ?? ()

     Scratching the surface shows that env_findvar() is called with a group
     pointer of 0x38000200, and group->tg_envp is 0x380004b8, both of which
     are reasonable. But *group->tg_envp is 0xA. Further, if I "watch
     *(int*)0x380004b8" in GDB, I see it is getting overwritten by
     up_serialout() invoked from stm32_serial.c::up_send.
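
The numbers in the dump above can be checked with a little arithmetic. The sketch below assumes that the dump_stackinfo "size" field is in decimal bytes (2000, which puts the top of the stack at the address of the last stack_dump line) and that the stack grows downward, as it does on ARMv7-M; the variable names are made up for the example.

```python
# Values copied from the crash dump and GDB session in this message.
STACK_BASE = 0x38000518   # dump_stackinfo "base" (lowest stack address)
STACK_SIZE = 2000         # dump_stackinfo "size", assumed to be decimal bytes
SP_AT_FAULT = 0x38000C08  # dump_stackinfo "sp"
TG_ENVP = 0x380004B8      # group->tg_envp, the location being overwritten

stack_top = STACK_BASE + STACK_SIZE        # stack grows down from here
used_at_fault = stack_top - SP_AT_FAULT    # bytes in use when the fault hit
gap_below_base = STACK_BASE - TG_ENVP      # distance from stack bottom to tg_envp

print(f"stack range   : {STACK_BASE:#x} .. {stack_top:#x}")
print(f"used at fault : {used_at_fault} bytes")
print(f"tg_envp is {gap_below_base} bytes below the stack base")
```

Under those assumptions the overwritten location sits only 96 bytes below the bottom of the task stack, so a call chain that runs the stack at least that far past its base would clobber it. That would be consistent with an overflow in a deeper call path, even though only 224 bytes were in use at the point where the fault was finally reported.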

     Any suggestions on how I can best track this down further?

     Thanks in advance!

     --
     Peter Barada
     [email protected]

