On 4/4/2023 3:32 PM, Fotis Panagiotopoulos wrote:
> I am trying again on my original setup (not the simplified defconfig
> that I provided later). I can now see how this is related to Greg's
> commit.
>
> Here is the stack trace at the time of the error:
> https://pasteboard.co/9QEmhZJFvIHC.png
>
> This is what I think happens. nxtask_assign_pid() calls
> kmm_free(g_pidhash). Supposedly, right after freeing it, it should set
> again g_pidhash to pidhash; kmm_free however uses a semaphore. When free
> is complete, it posts the semaphore. nxsem_post() will internally call
> nxsem_checkholder() to perform the new check. This leads to a call to
> nxsched_get_tcb() that tries to access g_pidhash.
>
> But! g_pidhash is deallocated at this point! And thus it points to
> garbage. KASAN is right to complain.
That sounds right! I am glad you nailed that. Stale memory use problems can be insidious.
This is a chicken'n'egg problem: nxsem_post() and nxsem_checkholder() need the pid hash table; freeing the hash table needs to post. It sounds like you have a solution in mind.
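
To make the ordering problem concrete, here is a minimal, self-contained sketch. It is a simulation, not NuttX code: table_s, g_pool, table_lookup(), table_free() and the two grow_*() functions are hypothetical stand-ins for g_pidhash, nxsched_get_tcb(), kmm_free() and the table-growing path in nxtask_assign_pid(). It only shows that a lookup triggered from inside the free sees a dead table when the global pointer is updated after the free, and that publishing the new pointer first avoids this. Whether that is the fix you have in mind is an assumption on my part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the pid hash table. "live" is false once the
 * table has been handed back to the (simulated) allocator.
 */

struct table_s
{
  bool live;
};

static struct table_s  g_pool[2];        /* simulated heap: old and new table */
static struct table_s *g_table;          /* plays the role of g_pidhash */
static int             g_stale_lookups;  /* lookups that hit a freed table */

/* Plays the role of nxsched_get_tcb(): reads through the global pointer. */

static void table_lookup(void)
{
  if (g_table != NULL && !g_table->live)
    {
      g_stale_lookups++;                 /* this would be a use-after-free */
    }
}

/* Plays the role of kmm_free(): releases the table, then "posts" the heap
 * semaphore, which in turn performs a lookup (the holder check).
 */

static void table_free(struct table_s *t)
{
  assert(t->live);                       /* catch a double free in the model */
  t->live = false;
  table_lookup();
}

static void reset(void)
{
  g_pool[0].live  = true;                /* current table */
  g_pool[1].live  = false;               /* not yet allocated */
  g_table         = &g_pool[0];
  g_stale_lookups = 0;
}

/* Ordering described in the report: free first, update the pointer after. */

int grow_free_first(void)
{
  struct table_s *newtab = &g_pool[1];

  reset();
  newtab->live = true;                   /* "allocate" the larger table */
  table_free(g_table);                   /* lookup runs on the freed table */
  g_table = newtab;
  return g_stale_lookups;
}

/* Alternative ordering: publish the new table, then free the old one. */

int grow_swap_first(void)
{
  struct table_s *newtab = &g_pool[1];
  struct table_s *oldtab;

  reset();
  newtab->live = true;                   /* "allocate" the larger table */
  oldtab  = g_table;
  g_table = newtab;                      /* global is valid from here on */
  table_free(oldtab);                    /* lookup now sees the new table */
  return g_stale_lookups;
}
```

In this model grow_free_first() counts one stale lookup and grow_swap_first() counts none. The real code would of course also have to copy the existing entries and keep the swap inside the critical section; the sketch leaves that out.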