Re: STM32H7 crash

Nathan Hartman Sun, 08 Feb 2026 13:58:26 -0800

Hi Peter,

That is interesting (and strange) indeed. IIRC the only difference between
those two chips is that the 753 has built-in crypto accelerators while the
743 does not. I believe that a firmware image built for one will work
correctly on the other (provided obviously that the firmware does not
attempt to access the crypto accelerators).


Did you make a separate build for each chip?

Or did you flash an *identical* image to both boards with stack size = 2048
and the same image succeeded on the 753 and failed on the 743?

I'm asking because if it's an identical image, that would require a quite
different debugging strategy than if it was a separate build for each chip.
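
One quick way to settle the identical-image question is to hash the binary that was flashed to each board and compare the digests. A minimal sketch, assuming both images are available as files; the paths in the usage comment are hypothetical placeholders, not names from this thread:

```python
import hashlib
import sys
from pathlib import Path


def image_digest(path):
    """Return the SHA-256 hex digest of a firmware image file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_image(path_a, path_b):
    """True if both files hash to the same SHA-256 digest."""
    return image_digest(path_a) == image_digest(path_b)


if __name__ == "__main__":
    # Usage (hypothetical paths):
    #   python compare_images.py build-a/nuttx.bin build-b/nuttx.bin
    if len(sys.argv) == 3:
        print("identical" if same_image(sys.argv[1], sys.argv[2]) else "different")
```

If the digests differ, the two boards are not running the same image, and the build configurations are worth comparing first.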

Thanks,
Nathan

On Sun, Feb 8, 2026 at 3:11 PM Peter Barada <[email protected]> wrote:

> Nathan,
>
> What's strange is that the same master source (nuttx hash
> e83606732d5e71eb98a9eb544537dbbeb71aa58b, apps hash
> d48b45000d1d083082f7a1650f351573c36a87d0) with INIT_STACKSIZE=2048 in the
> default .config fails on nucleo-h743zi2 but passes on nucleo-h743zi2 (run on
> my nucleo-h753zi board) when I try "time ls".  I turned on all the stack
> checks just to be sure nucleo-f446re wasn't just "lucky".
> On 2/7/26 23:54, Nathan Hartman wrote:
>
> Yeah, it's usually the stack, but does anyone know why it needs to be
> enlarged now? Is something using more stack than before?
>
> On Sat, Feb 7, 2026 at 5:28 PM Peter Barada <[email protected]>
> wrote:
>
>> Cranking up CONFIG_INIT_STACKSIZE to 3072 fixes the issue.
>>
>> I tried enabling STACK_COLORATION, STACK_USAGE, and ARMV7M_STACKTRACE
>> while leaving INIT_STACKSIZE at 2048, hoping to debug using
>> STM32CubeIDE, but when I try "time ls" the GDB session is lost (which
>> seems strange).
>>
>> If I then enable ARMV7M_STACKCHECK_BREAKPOINT, GDB stops when it detects
>> the stack overflow and I can get a call stack to understand why, but I
>> can't continue (to show the dump).
>>
>> Finally, after enabling ARCH_STACKDUMP, ARMV7M_STACKCHECK,
>> SCHED_BACKTRACE, STACK_COLORATION, and STACK_USAGE, disabling
>> STACKCHECK_BREAKPOINT, and setting ARCH_INTERRUPTSTACK=2048 and
>> ARCH_STACKDUMP_MAX_LENGTH=1024, I get a full dump when it detects the
>> stack overflow.
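
To make that combination easier to reproduce, here is a minimal sketch that checks a .config for the options listed above. It assumes each option takes the usual CONFIG_ prefix (as CONFIG_INIT_STACKSIZE does earlier in this thread) and that STACKCHECK_BREAKPOINT refers to ARMV7M_STACKCHECK_BREAKPOINT; the helper names are made up for illustration.

```python
# Options named in the message above; the CONFIG_ prefix is an assumption.
# None means the option is expected to be unset (disabled).
WANTED = {
    "CONFIG_ARCH_STACKDUMP": "y",
    "CONFIG_ARMV7M_STACKCHECK": "y",
    "CONFIG_SCHED_BACKTRACE": "y",
    "CONFIG_STACK_COLORATION": "y",
    "CONFIG_STACK_USAGE": "y",
    "CONFIG_ARMV7M_STACKCHECK_BREAKPOINT": None,
    "CONFIG_ARCH_INTERRUPTSTACK": "2048",
    "CONFIG_ARCH_STACKDUMP_MAX_LENGTH": "1024",
}


def parse_config(text):
    """Parse Kconfig-style 'NAME=value' lines, skipping comments and blanks."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            values[name] = value.strip('"')
    return values


def mismatches(text, wanted=WANTED):
    """Return {name: (expected, actual)} for options that differ from `wanted`."""
    have = parse_config(text)
    return {
        name: (expected, have.get(name))
        for name, expected in wanted.items()
        if have.get(name) != expected
    }
```

Calling `mismatches(open(".config").read())` returns an empty dict when the configuration matches the list.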
>>
>> Thanks for the help!
>>
>>
>> On 2/7/26 03:25, raiden00pl wrote:
>> > hi, this is a 100% stack issue. Increase all stack sizes to at least
>> 4092.
>> > Another option is to enable full optimisation with
>> CONFIG_DEBUG_FULLOPT=y,
>> > should also help.
>> >
>> > quick tip: about 80% of crashes in NuttX are stack issues, the first
>> thing
>> > you
>> > always do when such crashes occur is to increase all stack sizes :)
>> >
>> > On Sat, 7 Feb 2026 at 04:02, Matteo Golin <[email protected]> wrote:
>> >
>> >> I am not familiar enough, but there should be an option for stack
>> canaries.
>> >> I haven't had much luck with that configuration, and I imagine that
>> your
>> >> DEBUGASSERT will trigger before stack smashing is detected.
>> >>
>> >> Matteo
>> >>
>> >> On Fri, Feb 6, 2026, 8:45 PM Peter Barada <[email protected]>
>> wrote:
>> >>
>> >>> Haven't tried yet (personally I feel I should know _why_ it happens).
>> >>> Is there a config for compiling in stack checking on function entry?
>> >>> On 2/6/26 20:22, Matteo Golin wrote:
>> >>>
>> >>> Hmmm, if the problem goes that far back it may not be worth triaging
>> that
>> >>> way. Things have probably diverged so much since then. No luck with
>> the
>> >>> stack increase?
>> >>>
>> >>> Matteo
>> >>>
>> >>> On Fri, Feb 6, 2026, 8:18 PM Peter Barada <[email protected]>
>> >> wrote:
>> >>>> Matteo,
>> >>>>
>> >>>> I'm walking back release points and have had to change board
>> >>>> configuration names (to nucleo-h743zi), rename nuttx-apps to apps,
>> >>>> and I am still seeing the fault in the release/11.0 branch.
>> >>>>
>> >>>> I'm trying to go back further but wondering if I'll find a bisect
>> start
>> >>>> point...
>> >>>> On 2/6/26 17:05, Matteo Golin wrote:
>> >>>>
>> >>>> Hi Peter,
>> >>>>
>> >>>> My approach is kind of a headache since bisecting over an area where
>> >> apps
>> >>>> and NuttX are not always in sync is a major limitation of the split
>> >> repo.
>> >>>> My approach is usually:
>> >>>>
>> >>>> - Start the bisect in kernel
>> >>>> - Check the commit date of the current HEAD
>> >>>> - Check out to a commit of the same/similar date in apps
>> >>>> - Build
>> >>>> - Mentally note if this commit was good or bad based on the results
>> of
>> >>>> running the image
>> >>>> - make distclean (avoids artifacts carrying over between bisections
>> and
>> >>>> breaking everything)
>> >>>> - Mark commit good or bad with git bisect
>> >>>>
>> >>>> Then basically repeat this until bisecting is finished. It sucks and
>> I
>> >>>> did suggest a script in /tools/ to try and automate most of this,
>> but I
>> >>>> never got around to writing it.
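
The date-matching part of the steps above can be sketched as follows. This is only an illustration, not the script proposed for /tools/: the repository directory names (`nuttx`, `apps`), the apps branch name (`master`), and all function names are assumptions. It uses only standard git commands (`git show -s --format=%cI`, `git rev-list -1 --before=...`, `git checkout --detach`).

```python
import subprocess


def git(repo, *args, run=subprocess.run):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def matching_apps_commit(kernel_repo, apps_repo, apps_branch="master",
                         run=subprocess.run):
    """Find the newest apps commit that is not newer than the kernel HEAD."""
    # Committer date of the kernel commit that git bisect has checked out.
    date = git(kernel_repo, "show", "-s", "--format=%cI", "HEAD", run=run)
    # Newest commit on the apps branch at or before that date.
    return git(apps_repo, "rev-list", "-1", f"--before={date}", apps_branch,
               run=run)


def sync_apps(kernel_repo="nuttx", apps_repo="apps", run=subprocess.run):
    """Check out the date-matched apps commit and return its hash."""
    commit = matching_apps_commit(kernel_repo, apps_repo, run=run)
    git(apps_repo, "checkout", "--detach", commit, run=run)
    return commit
```

Calling `sync_apps()` after each `git bisect good` or `git bisect bad` in the kernel repo covers the "check the date" and "check out apps" steps; configuring, building, flashing, `make distclean`, and marking the commit remain manual. The `run` parameter exists only so the git calls can be stubbed out when trying the logic without real repositories.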
>> >>>>
>> >>>> I would suggest you start by checking for the issue on a stable
>> release
>> >>>> (i.e. 12.12.0) to see if that's a good commit you can start from.
>> >> Usually
>> >>>> those releases have a higher degree of testing because everyone who
>> >> voted
>> >>>> for the release ran some images on their hardware.
>> >>>>
>> >>>> That's honestly a lot of work but you never know if it'll end up
>> being
>> >>>> faster than trying to triage with logs!
>> >>>>
>> >>>> Matteo
>> >>>>
>> >>>> On Fri, Feb 6, 2026, 4:50 PM Nathan Hartman <
>> [email protected]>
>> >>>> wrote:
>> >>>>
>> >>>>> First place I would look: is the stack overflowing? (You could try
>> >>>>> enabling some of the stack debugging features.)
>> >>>>>
>> >>>>> On Fri, Feb 6, 2026 at 4:34 PM Peter Barada <[email protected]
>> >
>> >>>>> wrote:
>> >>>>>
>> >>>>>> Matteo,
>> >>>>>>
>> >>>>>> I don't know if this was working before but if you can suggest a
>> good
>> >>>>>> starting point I can cycle through git bisect to narrow down to the
>> >>>>>> failing commit.  What's the best approach to using git bisect
>> across
>> >>>>>> multiple repos (since changes in nuttx may have necessary changes
>> in
>> >>>>>> nuttx-apps and need to keep them in sync at each build point)?
>> >>>>>>
>> >>>>>> As an aside, I also have a nucleo-f446re board, and 'time ls'
>> >>>>>> works fine there.
>> >>>>>>
>> >>>>>> Further, does anyone have GDB scripts that make it easier to
>> decipher
>> >>>>>> Nuttx structures from memory (e.g. dump task/semaphore lists, etc)?
>> >>>>>> I've
>> >>>>>> started cobbling snippets but figure I'd ask before reinventing the
>> >>>>>> wheel.
>> >>>>>>
>> >>>>>>
>> >>>>>> On 2/6/26 16:12, Matteo Golin wrote:
>> >>>>>>> Hi Peter,
>> >>>>>>>
>> >>>>>>> If you happen to know that this was working before on an older
>> NuttX
>> >>>>>>> version, you could use git bisect to narrow down the breaking
>> >> commit.
>> >>>>>>> Then the issue might be clearer.
>> >>>>>>>
>> >>>>>>> Best,
>> >>>>>>> Matteo
>> >>>>>>>
>> >>>>>>> On Fri, Feb 6, 2026, 4:09 PM Peter Barada <[email protected]
>> >
>> >>>>>> wrote:
>> >>>>>>>      I have an STM32 Nucleo-h753zi board and configured a build for
>> >>>>>>>      nucleo-h743zi2:nsh (which is the closest board/chip; the
>> >>>>>>>      stm32h753zi is the same as the stm32h743zi, but the h753zi
>> >>>>>>>      includes crypto acceleration hardware).
>> >>>>>>>      Build works, but if I boot and try 'time ls' nuttx faults:
>> >>>>>>>
>> >>>>>>>      nsh> uname -a
>> >>>>>>>      NuttX 0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm
>> nucleo-h743zi2
>> >>>>>>>      nsh> time ls
>> >>>>>>>      /:
>> >>>>>>>        dev/
>> >>>>>>>
>> >>>>>>>      0.00dump_assert_info: Current Version: NuttX  0.0.0
>> 9ecfff0833
>> >>>>>>>      Feb  6 2026 15:45:28 arm
>> >>>>>>>      dump_assert_info: Assertion failed panic: at file: :0 task:
>> >>>>>>>      <noname> process: <noname> 0x800c9fd
>> >>>>>>>      up_dump_register: R0: 0801e624 R1: 0000000a R2: 00000050  R3:
>> >>>>>> 0000000a
>> >>>>>>>      up_dump_register: R4: 00000001 R5: 240000e4 R6: 00000000  FP:
>> >>>>>> 00000000
>> >>>>>>>      up_dump_register: R8: 00000000 SB: 00000000 SL: 00000000 R11:
>> >>>>>> 00000000
>> >>>>>>>      up_dump_register: IP: 00000000 SP: 38000c08 LR: 080059db  PC:
>> >>>>>> 08005984
>> >>>>>>>      up_dump_register: xPSR: 41000000 BASEPRI: 00000000 CONTROL:
>> >>>>>> 00000000
>> >>>>>>>      up_dump_register: EXC_RETURN: ffffffe9
>> >>>>>>>      dump_stackinfo: User Stack:
>> >>>>>>>      dump_stackinfo:   base: 0x38000518
>> >>>>>>>      dump_stackinfo:   size: 00002000
>> >>>>>>>      dump_stackinfo:     sp: 0x38000c08
>> >>>>>>>      stack_dump: 0x38000be8: 00000000 00000000 00000000 00000000
>> >>>>>>>      00000000 00000000 00000000 00000000
>> >>>>>>>      stack_dump: 0x38000c08: 0000000a 0801e624 0801e624 38000200
>> >>>>>>>      38000fac 00000000 0801e624 080172c1
>> >>>>>>>      stack_dump: 0x38000c28: 00000000 0801e624 38000200 38000158
>> >>>>>>>      00000000 00000000 38000fac 0800caa1
>> >>>>>>>      stack_dump: 0x38000c48: 00000000 0800cc77 0801e624 000002fc
>> >>>>>>>      38000500 00000001 00000001 38000cf0
>> >>>>>>>      stack_dump: 0x38000c68: 38000cf0 00000008 38000200 00000000
>> >>>>>>>      00000000 0800ca79 38000500 00000001
>> >>>>>>>      stack_dump: 0x38000c88: 00000064 38000cf0 00000064 0800ca33
>> >>>>>>>      38000500 00000001 00000064 00000000
>> >>>>>>>      stack_dump: 0x38000ca8: 00000000 08009325 00000000 38000500
>> >>>>>>>      00000001 0800c9fd 00000000 080052f1
>> >>>>>>>      stack_dump: 0x38000cc8: 00000000 38000500 00000000 38000158
>> >>>>>>>      00000001 00000001 00000000 00000000
>> >>>>>>>      stack_dump: 0x38000ce8: 00000000 00000000 00000000 00000000
>> >>>>>>>      00000000 00000000 00000000 00000000
>> >>>>>>>      dump_tasks:    PID GROUP PRI POLICY   TYPE    NPX STATE
>> EVENT
>> >>>>>>>        SIGMASK          STACKBASE  STACKSIZE   COMMAND
>> >>>>>>>      dump_task:       0     0   0 FIFO     Kthread -   Ready
>> >>>>>>>      0000000000000000 0x240018b0      1000   <noname>
>> >>>>>>>      dump_task:       1     1 100 RR       Task    -   Running
>> >>>>>>>      0000000000000000 0x38000518      2000   <noname> ��]���&
>> >>>>>>>
>> >>>>>>>      Wondering if anyone has run across this before?  Backtrace
>> >> shows:
>> >>>>>>>      Program received signal SIGTRAP, Trace/breakpoint trap.
>> >>>>>>>      exception_common () at armv7-m/arm_exception.S:127
>> >>>>>>>      127             mrs             r0, ipsr           /*
>> >> R0=exception
>> >>>>>>>      number */
>> >>>>>>>      where
>> >>>>>>>      #0  exception_common () at armv7-m/arm_exception.S:127
>> >>>>>>>      #1  <signal handler called>
>> >>>>>>>      #2  0x08005984 in env_cmpname (pszname=0x801e624 "PS1",
>> >>>>>>>           peqname=0xa <error: Cannot access memory at address
>> 0xa>)
>> >>>>>>>           at environ/env_findvar.c:50
>> >>>>>>>      #3  0x080059da in env_findvar (group=0x38000200,
>> pname=0x801e624
>> >>>>>>>      "PS1")
>> >>>>>>>           at environ/env_findvar.c:105
>> >>>>>>>      #4  0x080172c0 in getenv (name=0x801e624 "PS1") at
>> >>>>>>>      environ/env_getenv.c:89
>> >>>>>>>      #5  0x0800caa0 in nsh_update_prompt () at nsh_prompt.c:77
>> >>>>>>>      #6  0x0800cc76 in nsh_session (pstate=0x38000cf0, login=1,
>> >> argc=1,
>> >>>>>>>           argv=0x38000500) at nsh_session.c:249
>> >>>>>>>      #7  0x0800ca78 in nsh_consolemain (argc=1, argv=0x38000500)
>> >>>>>>>           at nsh_consolemain.c:77
>> >>>>>>>      #8  0x0800ca32 in nsh_main (argc=1, argv=0x38000500) at nsh_
>> >>>>>> main.c:76
>> >>>>>>>      #9  0x08009324 in nxtask_startup (entrypt=0x800c9fd
>> <nsh_main>,
>> >>>>>>>      argc=1,
>> >>>>>>>           argv=0x38000500) at sched/task_startup.c:72
>> >>>>>>>      #10 0x080052f0 in nxtask_start () at task/task_start.c:104
>> >>>>>>>      #11 0x00000000 in ?? ()
>> >>>>>>>
>> >>>>>>>      Scratching the surface shows that env_findvar() is called with a
>> >>>>>>>      group pointer of 0x38000200 and group->tg_envp is 0x380004b8,
>> >>>>>>>      both of which are reasonable. But *group->tg_envp is 0xA.
>> >>>>>>>      Further, if I "watch *(int*)0x380004b8" in GDB, I see it is
>> >>>>>>>      getting overwritten by up_serialout() invoked from
>> >>>>>>>      stm32_serial.c::up_send.
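
The addresses in the dump can be lined up with a little arithmetic. The sketch below uses only values quoted above; it assumes the "size" field is decimal 2000, which is consistent with the stack dump ending at 0x38000ce8.

```python
# Values copied from the dump_stackinfo output and the GDB observations above.
STACK_BASE = 0x38000518   # dump_stackinfo "base" (lowest stack address)
STACK_SIZE = 2000         # dump_stackinfo "size", assumed to be decimal
SP = 0x38000C08           # dump_stackinfo "sp"
TG_ENVP = 0x380004B8      # value of group->tg_envp reported in GDB

stack_top = STACK_BASE + STACK_SIZE   # stack grows down from here
in_use_at_fault = stack_top - SP      # bytes between SP and the top
below_base = STACK_BASE - TG_ENVP     # how far tg_envp points below the stack

print(f"stack top:              {stack_top:#010x}")   # 0x38000ce8
print(f"bytes in use at fault:  {in_use_at_fault}")   # 224
print(f"tg_envp below base:     {below_base} bytes")  # 96
```

Only 224 bytes are in use when the fault is reported, but the corrupted location sits 96 bytes below the stack base. On a descending stack that is where an overflow lands, so the numbers fit an earlier, deeper call chain (such as the serial output path) having run past the base. That reading is an inference from the arithmetic, not something the dump proves on its own.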
>> >>>>>>>
>> >>>>>>>      Any suggestions on how I can best track this down further?
>> >>>>>>>
>> >>>>>>>      Thanks in advance!
>> >>>>>>>
>> >>>>>>>      --
>> >>>>>>>      Peter Barada
>> >>>>>>>      [email protected]
>> >>>>>>>
>> >>>>>> --
>> >>>>>> Peter Barada
>> >>>>>> [email protected]
>> >>>>>>
>> >>>>> --
>> >>>> Peter [email protected]
>> >>>>
>> >>>> --
>> >>> Peter [email protected]
>> >>>
>> >>>
>> --
>> Peter Barada
>> [email protected]
>>
>> --
> Peter [email protected]
>
>
