Hi, this is a 100% stack issue. Increase all stack sizes to at least 4092. Another option is to enable full optimisation with CONFIG_DEBUG_FULLOPT=y, which usually reduces stack usage and should also help.
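
As a rough sanity check on the stack diagnosis (not a proof), the addresses in Peter's quoted report below can be compared with simple arithmetic. The sketch assumes that the "size: 00002000" field is decimal, which matches the STACKSIZE column of 2000 in the task list and the last stack_dump row at 0x38000ce8. The helper name stack_layout is made up for illustration.

```python
# Figures copied from the quoted dump_stackinfo output and GDB session.
STACK_BASE = 0x38000518   # dump_stackinfo "base": lowest address of the stack
STACK_SIZE = 2000         # dump_stackinfo "size: 00002000", read as decimal (assumption)
SP_AT_FAULT = 0x38000C08  # dump_stackinfo "sp"
ENVP_ADDR = 0x380004B8    # group->tg_envp, the word that ends up holding 0xA


def stack_layout(base, size, sp, victim):
    """Summarise where sp and a corrupted address sit relative to the stack."""
    top = base + size  # the stack grows downward from here on ARM
    return {
        "top": top,
        "used_at_fault": top - sp,           # bytes between the top and sp
        "victim_below_base": base - victim,  # positive: victim lies under the stack
    }


layout = stack_layout(STACK_BASE, STACK_SIZE, SP_AT_FAULT, ENVP_ADDR)
for name, value in layout.items():
    print(f"{name:18} {value:#010x} ({value})")
```

With these inputs the stack top is 0x38000ce8, only 224 bytes are in use at the moment of the fault, and the tg_envp storage sits 96 bytes below the stack base. A stack that at some earlier point overran its base by roughly 96 bytes or more would therefore reach that word, which fits the overwrite Peter saw from up_serialout().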

Quick tip: about 80% of crashes in NuttX are stack issues, so the first thing to do when such a crash occurs is to increase all stack sizes :)

On Sat, 7 Feb 2026 at 04:02, Matteo Golin <[email protected]> wrote:

> I am not familiar enough, but there should be an option for stack canaries.
> I haven't had much luck with that configuration, and I imagine that your
> DEBUGASSERT will trigger before stack smashing is detected.
>
> Matteo
>
> On Fri, Feb 6, 2026, 8:45 PM Peter Barada <[email protected]> wrote:
>
> > Haven't tried yet (personally feel I should know _why_ it happens) - is
> > there a config for compiling in stack checking on function entry?
> > On 2/6/26 20:22, Matteo Golin wrote:
> >
> > Hmmm, if the problem goes that far back it may not be worth triaging that
> > way. Things have probably diverged so much since then. No luck with the
> > stack increase?
> >
> > Matteo
> >
> > On Fri, Feb 6, 2026, 8:18 PM Peter Barada <[email protected]> wrote:
> >
> >> Matteo,
> >>
> >> I'm walking back release points and have had to change board
> >> configuration names (to nucleo-h743zi), rename nuttx-apps to apps, and
> >> am still seeing the fault in the release/11.0 branch.
> >>
> >> I'm trying to go back further but wondering if I'll find a bisect start
> >> point...
> >> On 2/6/26 17:05, Matteo Golin wrote:
> >>
> >> Hi Peter,
> >>
> >> My approach is kind of a headache since bisecting over an area where apps
> >> and NuttX are not always in sync is a major limitation of the split repo.
> >> My approach is usually:
> >>
> >> - Start the bisect in kernel
> >> - Check the commit date of the current HEAD
> >> - Check out to a commit of the same/similar date in apps
> >> - Build
> >> - Mentally note if this commit was good or bad based on the results of
> >>   running the image
> >> - make distclean (avoids artifacts carrying over between bisections and
> >>   breaking everything)
> >> - Mark commit good or bad with git bisect
> >>
> >> Then basically repeat this until bisecting is finished. It sucks and I
> >> did suggest a script in /tools/ to try and automate most of this, but I
> >> never got around to writing it.
> >>
> >> I would suggest you start by checking for the issue on a stable release
> >> (e.g. 12.12.0) to see if that's a good commit you can start from. Usually
> >> those releases have a higher degree of testing because everyone who voted
> >> for the release ran some images on their hardware.
> >>
> >> That's honestly a lot of work but you never know if it'll end up being
> >> faster than trying to triage with logs!
> >>
> >> Matteo
> >>
> >> On Fri, Feb 6, 2026, 4:50 PM Nathan Hartman <[email protected]> wrote:
> >>
> >>> First place I would look: is the stack overflowing? (You could try
> >>> enabling some of the stack debugging features.)
> >>>
> >>> On Fri, Feb 6, 2026 at 4:34 PM Peter Barada <[email protected]> wrote:
> >>>
> >>>> Matteo,
> >>>>
> >>>> I don't know if this was working before but if you can suggest a good
> >>>> starting point I can cycle through git bisect to narrow down to the
> >>>> failing commit. What's the best approach to using git bisect across
> >>>> multiple repos (since changes in nuttx may have necessary changes in
> >>>> nuttx-apps and need to keep them in sync at each build point)?
> >>>>
> >>>> As an aside, I also have a nucleo-f446re board, and 'time ls' works
> >>>> fine there.
> >>>>
> >>>> Further, does anyone have GDB scripts that make it easier to decipher
> >>>> NuttX structures from memory (e.g. dump task/semaphore lists, etc)?
> >>>> I've started cobbling snippets but figure I'd ask before reinventing
> >>>> the wheel.
> >>>>
> >>>>
> >>>> On 2/6/26 16:12, Matteo Golin wrote:
> >>>> > Hi Peter,
> >>>> >
> >>>> > If you happen to know that this was working before on an older NuttX
> >>>> > version, you could use git bisect to narrow down the breaking commit.
> >>>> > Then the issue might be clearer.
> >>>> >
> >>>> > Best,
> >>>> > Matteo
> >>>> >
> >>>> > On Fri, Feb 6, 2026, 4:09 PM Peter Barada <[email protected]> wrote:
> >>>> >
> >>>> > I have an STM32 Nucleo-h753zi board and configured a build for
> >>>> > nucleo-h743zi2:nsh (which is the closest board/chip; the stm32h753zi
> >>>> > is the same as the stm32h743zi but the h753zi includes crypto
> >>>> > acceleration hardware).
> >>>> >
> >>>> > Build works, but if I boot and try 'time ls' nuttx faults:
> >>>> >
> >>>> > nsh> uname -a
> >>>> > NuttX 0.0.0 9ecfff0833 Feb 6 2026 15:45:28 arm nucleo-h743zi2
> >>>> > nsh> time ls
> >>>> > /:
> >>>> > dev/
> >>>> >
> >>>> > 0.00dump_assert_info: Current Version: NuttX 0.0.0 9ecfff0833 Feb 6 2026 15:45:28 arm
> >>>> > dump_assert_info: Assertion failed panic: at file: :0 task: <noname> process: <noname> 0x800c9fd
> >>>> > up_dump_register: R0: 0801e624 R1: 0000000a R2: 00000050 R3: 0000000a
> >>>> > up_dump_register: R4: 00000001 R5: 240000e4 R6: 00000000 FP: 00000000
> >>>> > up_dump_register: R8: 00000000 SB: 00000000 SL: 00000000 R11: 00000000
> >>>> > up_dump_register: IP: 00000000 SP: 38000c08 LR: 080059db PC: 08005984
> >>>> > up_dump_register: xPSR: 41000000 BASEPRI: 00000000 CONTROL: 00000000
> >>>> > up_dump_register: EXC_RETURN: ffffffe9
> >>>> > dump_stackinfo: User Stack:
> >>>> > dump_stackinfo: base: 0x38000518
> >>>> > dump_stackinfo: size: 00002000
> >>>> > dump_stackinfo: sp: 0x38000c08
> >>>> > stack_dump: 0x38000be8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>> > stack_dump: 0x38000c08: 0000000a 0801e624 0801e624 38000200 38000fac 00000000 0801e624 080172c1
> >>>> > stack_dump: 0x38000c28: 00000000 0801e624 38000200 38000158 00000000 00000000 38000fac 0800caa1
> >>>> > stack_dump: 0x38000c48: 00000000 0800cc77 0801e624 000002fc 38000500 00000001 00000001 38000cf0
> >>>> > stack_dump: 0x38000c68: 38000cf0 00000008 38000200 00000000 00000000 0800ca79 38000500 00000001
> >>>> > stack_dump: 0x38000c88: 00000064 38000cf0 00000064 0800ca33 38000500 00000001 00000064 00000000
> >>>> > stack_dump: 0x38000ca8: 00000000 08009325 00000000 38000500 00000001 0800c9fd 00000000 080052f1
> >>>> > stack_dump: 0x38000cc8: 00000000 38000500 00000000 38000158 00000001 00000001 00000000 00000000
> >>>> > stack_dump: 0x38000ce8: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >>>> > dump_tasks: PID GROUP PRI POLICY TYPE NPX STATE EVENT SIGMASK STACKBASE STACKSIZE COMMAND
> >>>> > dump_task: 0 0 0 FIFO Kthread - Ready 0000000000000000 0x240018b0 1000 <noname>
> >>>> > dump_task: 1 1 100 RR Task - Running 0000000000000000 0x38000518 2000 <noname> ��]���&
> >>>> >
> >>>> > Wondering if anyone has run across this before? Backtrace shows:
> >>>> >
> >>>> > Program received signal SIGTRAP, Trace/breakpoint trap.
> >>>> > exception_common () at armv7-m/arm_exception.S:127
> >>>> > 127 mrs r0, ipsr /* R0=exception number */
> >>>> > where
> >>>> > #0 exception_common () at armv7-m/arm_exception.S:127
> >>>> > #1 <signal handler called>
> >>>> > #2 0x08005984 in env_cmpname (pszname=0x801e624 "PS1", peqname=0xa <error: Cannot access memory at address 0xa>) at environ/env_findvar.c:50
> >>>> > #3 0x080059da in env_findvar (group=0x38000200, pname=0x801e624 "PS1") at environ/env_findvar.c:105
> >>>> > #4 0x080172c0 in getenv (name=0x801e624 "PS1") at environ/env_getenv.c:89
> >>>> > #5 0x0800caa0 in nsh_update_prompt () at nsh_prompt.c:77
> >>>> > #6 0x0800cc76 in nsh_session (pstate=0x38000cf0, login=1, argc=1, argv=0x38000500) at nsh_session.c:249
> >>>> > #7 0x0800ca78 in nsh_consolemain (argc=1, argv=0x38000500) at nsh_consolemain.c:77
> >>>> > #8 0x0800ca32 in nsh_main (argc=1, argv=0x38000500) at nsh_main.c:76
> >>>> > #9 0x08009324 in nxtask_startup (entrypt=0x800c9fd <nsh_main>, argc=1, argv=0x38000500) at sched/task_startup.c:72
> >>>> > #10 0x080052f0 in nxtask_start () at task/task_start.c:104
> >>>> > #11 0x00000000 in ?? ()
> >>>> >
> >>>> > Scratching the surface shows that env_findvar() is called with a group
> >>>> > pointer of 0x38000200, and group->tg_envp is 0x380004b8, both of which
> >>>> > are reasonable. But *group->tg_envp is 0xA. Further, if I "watch
> >>>> > *(int*)0x380004b8" in GDB, I see it is getting overwritten by
> >>>> > up_serialout() invoked from stm32_serial.c::up_send.
> >>>> >
> >>>> > Any suggestions on how I can best track this down further?
> >>>> >
> >>>> > Thanks in advance!
> >>>> >
> >>>> > --
> >>>> > Peter Barada
> >>>> > [email protected]
> >>>> >
> >>>> --
> >>>> Peter Barada
> >>>> [email protected]
> >>>>
> >>> --
> >> Peter Barada
> >> [email protected]
> >>
> >> --
> > Peter Barada
> > [email protected]
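
The manual bisect routine Matteo lists in the quoted thread (pick a kernel commit, find an apps commit of a similar date, build, test, make distclean, mark) could be sketched roughly as follows. This is only an illustration, not the script he mentions for /tools/. The function names, the sample history, the "../apps" location and the default board name are assumptions; in a real checkout the date lookup could be done with something like `git rev-list -n 1 --before=<date> <branch>` in the apps repository.

```python
from bisect import bisect_right


def matching_apps_commit(apps_history, kernel_time):
    """Pick the newest apps commit that is not newer than the kernel commit.

    apps_history: list of (unix_time, commit_id) tuples, oldest first.
    Returns the commit id, or None if every apps commit is newer.
    """
    times = [t for t, _ in apps_history]
    idx = bisect_right(times, kernel_time)
    return apps_history[idx - 1][1] if idx else None


def bisect_step_checklist(apps_commit, board="nucleo-h743zi2:nsh", apps_dir="../apps"):
    """One bisect step as an ordered checklist, following the steps in the thread."""
    return [
        f"git -C {apps_dir} checkout {apps_commit}",
        f"./tools/configure.sh {board}",
        "make",
        "flash the image and run 'time ls' on the board",
        "make distclean",
        "git bisect good  (or: git bisect bad)",
    ]


# Hypothetical data: (commit time in Unix seconds, abbreviated commit id).
apps_history = [(1000, "a1"), (2000, "b2"), (3000, "c3")]
chosen = matching_apps_commit(apps_history, kernel_time=2500)
for line in bisect_step_checklist(chosen):
    print(line)
```

The checklist is printed rather than executed because the good/bad decision depends on running the image on real hardware, which is the part that is hard to automate.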
