Re: STM32H7 crash

raiden00pl Sat, 07 Feb 2026 00:23:45 -0800

hi, this is 100% a stack issue. Increase all stack sizes to at least 4092.
Another option is to enable full optimisation with CONFIG_DEBUG_FULLOPT=y,
which should also help.
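
As a rough illustration of the suggestion above, a defconfig fragment along these lines could be a starting point. Which stack options actually matter depends on the board configuration, so the selection here is an assumption, and 4096 is only a round value that satisfies "at least 4092". CONFIG_STACK_COLORATION is not mentioned in this message; it is included as an optional aid for measuring how much of each stack was really used.

```
# Sketch only: option selection and values are assumptions, adjust per board.
CONFIG_INIT_STACKSIZE=4096          # stack of the init task (nsh_main here)
CONFIG_IDLETHREAD_STACKSIZE=4096    # idle thread stack
CONFIG_DEFAULT_TASK_STACKSIZE=4096  # default for newly created tasks
CONFIG_PTHREAD_STACK_DEFAULT=4096   # default for pthreads
CONFIG_DEBUG_FULLOPT=y              # full optimisation, as suggested above
CONFIG_STACK_COLORATION=y           # optional: lets stack usage be measured
```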


quick tip: about 80% of crashes in NuttX are stack issues, so the first thing
you should always do when such a crash occurs is increase all stack sizes :)
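
A quick sanity check on the addresses from the original report quoted below seems consistent with a stack overflow. The sketch assumes that the "base" printed by dump_stackinfo is the lowest address of the task stack and that the stack grows downward (as it does on ARMv7-M); the constant and function names are made up for illustration.

```python
# Addresses copied from the crash report quoted further down in this thread.
STACK_BASE = 0x38000518   # dump_stackinfo "base" (assumed: lowest stack address)
SP_AT_CRASH = 0x38000C08  # dump_stackinfo "sp"
TG_ENVP = 0x380004B8      # location the GDB watchpoint saw being overwritten


def bytes_below_stack(addr: int, stack_base: int) -> int:
    """Distance from addr up to stack_base; positive means addr is below the stack."""
    return stack_base - addr


gap = bytes_below_stack(TG_ENVP, STACK_BASE)
headroom = SP_AT_CRASH - STACK_BASE

print(f"overwritten word sits {gap} bytes below the stack base")
print(f"sp was {headroom} bytes above the stack base when the dump was taken")
```

The overwritten word lies just under the stack base, which is where a deeper call chain (for example the serial output path) would land first if it ran past the end of the stack, even though sp had recovered by the time of the dump.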

On Sat, Feb 7, 2026 at 04:02, Matteo Golin <[email protected]> wrote:

> I am not familiar enough, but there should be an option for stack canaries.
> I haven't had much luck with that configuration, and I imagine that your
> DEBUGASSERT will trigger before stack smashing is detected.
>
> Matteo
>
> On Fri, Feb 6, 2026, 8:45 PM Peter Barada <[email protected]> wrote:
>
> > Haven't tried yet (personally I feel I should know _why_ it happens) - is there
> > a config for compiling in stack checking on function entry?
> > On 2/6/26 20:22, Matteo Golin wrote:
> >
> > Hmmm, if the problem goes that far back it may not be worth triaging that
> > way. Things have probably diverged so much since then. No luck with the
> > stack increase?
> >
> > Matteo
> >
> > On Fri, Feb 6, 2026, 8:18 PM Peter Barada <[email protected]> wrote:
> >
> >> Matteo,
> >>
> >> I'm walking back release points and have had to change board
> >> configuration names (to nucleo-h743zi), rename nuttx-apps to apps, and am
> >> still seeing the fault in the release/11.0 branch.
> >>
> >> I'm trying to go back further but wondering if I'll find a bisect start
> >> point...
> >> On 2/6/26 17:05, Matteo Golin wrote:
> >>
> >> Hi Peter,
> >>
> >> My approach is kind of a headache since bisecting over an area where apps
> >> and NuttX are not always in sync is a major limitation of the split repo.
> >> My approach is usually:
> >>
> >> - Start the bisect in kernel
> >> - Check the commit date of the current HEAD
> >> - Check out to a commit of the same/similar date in apps
> >> - Build
> >> - Mentally note if this commit was good or bad based on the results of
> >> running the image
> >> - make distclean (avoids artifacts carrying over between bisections and
> >> breaking everything)
> >> - Mark commit good or bad with git bisect
> >>
> >> Then basically repeat this until bisecting is finished. It sucks and I
> >> did suggest a script in /tools/ to try and automate most of this, but I
> >> never got around to writing it.
> >>
> >> I would suggest you start by checking for the issue on a stable release
> >> (e.g. 12.12.0) to see if that's a good commit you can start from. Usually
> >> those releases have a higher degree of testing because everyone who voted
> >> for the release ran some images on their hardware.
> >>
> >> That's honestly a lot of work but you never know if it'll end up being
> >> faster than trying to triage with logs!
> >>
> >> Matteo
> >>
> >> On Fri, Feb 6, 2026, 4:50 PM Nathan Hartman <[email protected]>
> >> wrote:
> >>
> >>> First place I would look: is the stack overflowing? (You could try
> >>> enabling some of the stack debugging features.)
> >>>
> >>> On Fri, Feb 6, 2026 at 4:34 PM Peter Barada <[email protected]>
> >>> wrote:
> >>>
> >>>> Matteo,
> >>>>
> >>>> I don't know if this was working before but if you can suggest a good
> >>>> starting point I can cycle through git bisect to narrow down to the
> >>>> failing commit.  What's the best approach to using git bisect across
> >>>> multiple repos (since changes in nuttx may have necessary changes in
> >>>> nuttx-apps and need to keep them in sync at each build point)?
> >>>>
> >>>> As an aside, I also have a nucleo-f446re board, and 'time ls' works fine
> >>>> there.
> >>>>
> >>>> Further, does anyone have GDB scripts that make it easier to decipher
> >>>> NuttX structures from memory (e.g. dump task/semaphore lists, etc.)?
> >>>> I've started cobbling snippets together but figured I'd ask before
> >>>> reinventing the wheel.
> >>>>
> >>>>
> >>>> On 2/6/26 16:12, Matteo Golin wrote:
> >>>> > Hi Peter,
> >>>> >
> >>>> > If you happen to know that this was working before on an older NuttX
> >>>> > version, you could use git bisect to narrow down the breaking commit.
> >>>> > Then the issue might be clearer.
> >>>> >
> >>>> > Best,
> >>>> > Matteo
> >>>> >
> >>>> > On Fri, Feb 6, 2026, 4:09 PM Peter Barada <[email protected]> wrote:
> >>>> >
> >>>> >     I have an STM32 Nucleo-H753ZI board and configured a build for
> >>>> >     nucleo-h743zi2:nsh (the closest board/chip; the stm32h753zi is the
> >>>> >     same as the stm32h743zi, but the h753zi includes crypto acceleration
> >>>> >     hardware).
> >>>> >
> >>>> >     Build works, but if I boot and try 'time ls' nuttx faults:
> >>>> >
> >>>> >     nsh> uname -a
> >>>> >     NuttX 0.0.0 9ecfff0833 Feb  6 2026 15:45:28 arm nucleo-h743zi2
> >>>> >     nsh> time ls
> >>>> >     /:
> >>>> >       dev/
> >>>> >
> >>>> >     0.00dump_assert_info: Current Version: NuttX  0.0.0 9ecfff0833
> >>>> >     Feb  6 2026 15:45:28 arm
> >>>> >     dump_assert_info: Assertion failed panic: at file: :0 task:
> >>>> >     <noname> process: <noname> 0x800c9fd
> >>>> >     up_dump_register: R0: 0801e624 R1: 0000000a R2: 00000050  R3: 0000000a
> >>>> >     up_dump_register: R4: 00000001 R5: 240000e4 R6: 00000000  FP: 00000000
> >>>> >     up_dump_register: R8: 00000000 SB: 00000000 SL: 00000000 R11: 00000000
> >>>> >     up_dump_register: IP: 00000000 SP: 38000c08 LR: 080059db  PC: 08005984
> >>>> >     up_dump_register: xPSR: 41000000 BASEPRI: 00000000 CONTROL: 00000000
> >>>> >     up_dump_register: EXC_RETURN: ffffffe9
> >>>> >     dump_stackinfo: User Stack:
> >>>> >     dump_stackinfo:   base: 0x38000518
> >>>> >     dump_stackinfo:   size: 00002000
> >>>> >     dump_stackinfo:     sp: 0x38000c08
> >>>> >     stack_dump: 0x38000be8: 00000000 00000000 00000000 00000000
> >>>> >     00000000 00000000 00000000 00000000
> >>>> >     stack_dump: 0x38000c08: 0000000a 0801e624 0801e624 38000200
> >>>> >     38000fac 00000000 0801e624 080172c1
> >>>> >     stack_dump: 0x38000c28: 00000000 0801e624 38000200 38000158
> >>>> >     00000000 00000000 38000fac 0800caa1
> >>>> >     stack_dump: 0x38000c48: 00000000 0800cc77 0801e624 000002fc
> >>>> >     38000500 00000001 00000001 38000cf0
> >>>> >     stack_dump: 0x38000c68: 38000cf0 00000008 38000200 00000000
> >>>> >     00000000 0800ca79 38000500 00000001
> >>>> >     stack_dump: 0x38000c88: 00000064 38000cf0 00000064 0800ca33
> >>>> >     38000500 00000001 00000064 00000000
> >>>> >     stack_dump: 0x38000ca8: 00000000 08009325 00000000 38000500
> >>>> >     00000001 0800c9fd 00000000 080052f1
> >>>> >     stack_dump: 0x38000cc8: 00000000 38000500 00000000 38000158
> >>>> >     00000001 00000001 00000000 00000000
> >>>> >     stack_dump: 0x38000ce8: 00000000 00000000 00000000 00000000
> >>>> >     00000000 00000000 00000000 00000000
> >>>> >     dump_tasks:    PID GROUP PRI POLICY   TYPE    NPX STATE  EVENT
> >>>> >       SIGMASK          STACKBASE  STACKSIZE   COMMAND
> >>>> >     dump_task:       0     0   0 FIFO     Kthread -   Ready
> >>>> >     0000000000000000 0x240018b0      1000   <noname>
> >>>> >     dump_task:       1     1 100 RR       Task    -   Running
> >>>> >     0000000000000000 0x38000518      2000   <noname> ��]���&
> >>>> >
> >>>> >     Wondering if anyone has run across this before?  Backtrace shows:
> >>>> >
> >>>> >     Program received signal SIGTRAP, Trace/breakpoint trap.
> >>>> >     exception_common () at armv7-m/arm_exception.S:127
> >>>> >     127             mrs             r0, ipsr           /* R0=exception number */
> >>>> >     where
> >>>> >     #0  exception_common () at armv7-m/arm_exception.S:127
> >>>> >     #1  <signal handler called>
> >>>> >     #2  0x08005984 in env_cmpname (pszname=0x801e624 "PS1",
> >>>> >          peqname=0xa <error: Cannot access memory at address 0xa>)
> >>>> >          at environ/env_findvar.c:50
> >>>> >     #3  0x080059da in env_findvar (group=0x38000200, pname=0x801e624
> >>>> >     "PS1")
> >>>> >          at environ/env_findvar.c:105
> >>>> >     #4  0x080172c0 in getenv (name=0x801e624 "PS1") at
> >>>> >     environ/env_getenv.c:89
> >>>> >     #5  0x0800caa0 in nsh_update_prompt () at nsh_prompt.c:77
> >>>> >     #6  0x0800cc76 in nsh_session (pstate=0x38000cf0, login=1, argc=1,
> >>>> >          argv=0x38000500) at nsh_session.c:249
> >>>> >     #7  0x0800ca78 in nsh_consolemain (argc=1, argv=0x38000500)
> >>>> >          at nsh_consolemain.c:77
> >>>> >     #8  0x0800ca32 in nsh_main (argc=1, argv=0x38000500) at nsh_main.c:76
> >>>> >     #9  0x08009324 in nxtask_startup (entrypt=0x800c9fd <nsh_main>,
> >>>> >     argc=1,
> >>>> >          argv=0x38000500) at sched/task_startup.c:72
> >>>> >     #10 0x080052f0 in nxtask_start () at task/task_start.c:104
> >>>> >     #11 0x00000000 in ?? ()
> >>>> >
> >>>> >     Scratching the surface shows that env_findvar() is called with a group
> >>>> >     pointer of 0x38000200, and group->tg_envp is 0x380004b8, both of which
> >>>> >     are reasonable. But *group->tg_envp is 0xA.  Further, if I "watch
> >>>> >     *(int*)0x380004b8" in GDB, I see it is getting overwritten by
> >>>> >     up_serialout() invoked from stm32_serial.c::up_send.
> >>>> >
> >>>> >     Any suggestions on how I can best track this down further?
> >>>> >
> >>>> >     Thanks in advance!
> >>>> >
> >>>> >     --
> >>>> >     Peter Barada
> >>>> >     [email protected]
> >>>> >
> >>>> --
> >>>> Peter Barada
> >>>> [email protected]
> >>>>
