On 3/12/2024 5:12 AM, Nathan Hartman wrote:
Try Alan's suggestion to use the stack monitor; that will help us understand
whether something is wrong. (If it shows that the old stack size was OK, while
we know corruption was happening, then we will know to look for an
out-of-bounds write.)
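
For readers unfamiliar with how a stack monitor estimates usage, here is a minimal, portable sketch of the stack coloring idea. The stack is filled with a known pattern before the thread runs, and the words that still hold the pattern are counted later. The names stack_color(), stack_used_words() and STACK_COLOR are hypothetical and are not the NuttX implementation; the sketch also assumes a stack that grows downward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_COLOR 0xdeadbeefu  /* hypothetical fill pattern */

/* Fill the whole stack region with the pattern before the thread starts. */
void stack_color(uint32_t *base, size_t nwords)
{
  assert(base != NULL);

  for (size_t i = 0; i < nwords; i++)
    {
      base[i] = STACK_COLOR;
    }
}

/* Return the number of words used so far.  Assumes the stack grows
 * downward from base + nwords toward base, so untouched words sit at
 * the low end and still hold the pattern.
 */
size_t stack_used_words(const uint32_t *base, size_t nwords)
{
  size_t untouched = 0;

  assert(base != NULL);

  while (untouched < nwords && base[untouched] == STACK_COLOR)
    {
      untouched++;
    }

  return nwords - untouched;
}
```

If the reported usage stays well below the allocated size while corruption still happens, a stack overflow becomes less likely and an out-of-bounds write elsewhere becomes the better suspect.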
Does the stac
After enlarging the stack size of the "AppBringUp" thread, the remote node can now boot NSH
on RPMSGFS. I am sorry for not trying this earlier. I was browsing "rpmsgfs.c"
blindly and noticed a few auto variables defined on the stack, so I thought it might be worth a try and
I did it.
That is
On 3/12/2024 1:10 AM, yfliu2008 wrote:
On the other hand, if we choose not to mount NSH from the RPMSGFS, it boots
smoothly, and after boot we can manually mount the RPMSGFS to experiment with it.
That sounds like an initialization sequencing problem. Perhaps
something is getting used before it has be
meminfo() can be helpful too. It detects many heap corruption problems
(but perhaps not all?). By sprinkling a few calls to kmm_meminfo() in
choice locations, you should also be able to isolate the culprit.
Perhaps after each time the lopri worker runs or after each rpmsg.
On 3/11/2024 1:20
Is there a way to colorize the heap to track down the bandit? Like a CRC pattern
on all the spaces around allocations, with a check on every call that the CRC
pattern is still OK?
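
A minimal sketch of that idea, using a fixed fill byte in guard zones instead of a CRC: each allocation is wrapped with a guard zone on both sides, and guard_check() reports whether either zone was overwritten. The names guard_malloc(), guard_check(), guard_free() and the sizes are hypothetical; this wraps the standard malloc() for illustration and is not the NuttX mm code. Overflow checking on the size arithmetic is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDR_BYTES   16    /* hypothetical header area, holds the size */
#define GUARD_BYTES 16    /* hypothetical guard zone size */
#define GUARD_FILL  0xa5  /* hypothetical fill byte */

/* Layout: [header][guard][payload][guard] */
void *guard_malloc(size_t size)
{
  uint8_t *raw = malloc(HDR_BYTES + GUARD_BYTES + size + GUARD_BYTES);

  if (raw == NULL)
    {
      return NULL;
    }

  memcpy(raw, &size, sizeof(size));
  memset(raw + HDR_BYTES, GUARD_FILL, GUARD_BYTES);
  memset(raw + HDR_BYTES + GUARD_BYTES + size, GUARD_FILL, GUARD_BYTES);
  return raw + HDR_BYTES + GUARD_BYTES;
}

/* Return 1 if both guard zones are intact, 0 if either was overwritten. */
int guard_check(const void *ptr)
{
  const uint8_t *payload = ptr;
  const uint8_t *raw = payload - GUARD_BYTES - HDR_BYTES;
  size_t size;

  memcpy(&size, raw, sizeof(size));
  for (size_t i = 0; i < GUARD_BYTES; i++)
    {
      if (raw[HDR_BYTES + i] != GUARD_FILL ||
          payload[size + i] != GUARD_FILL)
        {
          return 0;
        }
    }

  return 1;
}

/* Verify the guards one last time, then release the block. */
void guard_free(void *ptr)
{
  assert(guard_check(ptr));
  free((uint8_t *)ptr - GUARD_BYTES - HDR_BYTES);
}
```

Calling guard_check() on the live allocations at a few checkpoints narrows down when the overwrite happened, although it only catches writes that land inside a guard zone.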
Gregory Nutt wrote on Mon, 11 March 2024, 19:27:
> If the memory location that is corrupted is consistent, then you can
> monitor that
If the memory location that is corrupted is consistent, then you can
monitor that location to find the culprit (perhaps using debug output).
If your debugger supports it then setting a watchpoint could also
trigger a break when the corruption occurs.
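
When no hardware watchpoint is available, the same monitoring can be approximated in software. The sketch below records the value of one word and reports the first checkpoint at which it differs. The names struct mem_watch, watch_arm() and watch_check() are hypothetical helpers, not an existing NuttX API, and the checkpoint labels are whatever the caller chooses.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One watched 32-bit location and the value last seen there. */
struct mem_watch
{
  const volatile uint32_t *addr;
  uint32_t expected;
};

/* Start watching addr, remembering its current value. */
void watch_arm(struct mem_watch *w, const volatile uint32_t *addr)
{
  assert(w != NULL && addr != NULL);

  w->addr = addr;
  w->expected = *addr;
}

/* Call at checkpoints.  Returns 1 and logs the label if the value
 * changed since the previous check, otherwise returns 0.
 */
int watch_check(struct mem_watch *w, const char *where)
{
  uint32_t now = *w->addr;

  if (now == w->expected)
    {
      return 0;
    }

  fprintf(stderr, "watch %p: 0x%08" PRIx32 " -> 0x%08" PRIx32 " at %s\n",
          (void *)(uintptr_t)w->addr, w->expected, now, where);
  w->expected = now;
  return 1;
}
```

The first checkpoint that reports a change brackets the culprit between it and the previous checkpoint, so adding more checkpoints in that interval narrows the search further.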
Maybe you can also try disabling features
What's needed is some way to binary search where the culprit is.
If I understand correctly, it looks like the crash is happening in the
later stages of board bring-up? What is running before that? Can parts
be disabled or skipped to see if the problem goes away?
Another idea is to try running a s
The error is confusing because it probably did not occur at the time of
the assertion; it probably occurred much earlier.
In most crashes due to heap corruption there are two players: the
culprit and the victim threads. The culprit thread actually causes the
corrupti
On 3/10/2024 4:38 AM, yfliu2008 wrote:
Dear experts,
When doing a regression check on K230 with a previously working kernel-mode
configuration, I got an assertion error like the one below:
#0 _assert (filename=0x704c598 "mm_heap/mm_malloc.c", linenum=245, msg=0x0,regs=0x7082730
This does indicate