Yeah, but this macro is only working because of a coincidental match with a
variable prev that is already defined in the calling scope.
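A minimal sketch of what I mean, with a hypothetical call site (dq_entry_s
here is just a simplified stand-in, not the actual NuttX headers):

struct dq_entry_s
{
  struct dq_entry_s *flink;
  struct dq_entry_s *blink;
};

/* The macro body references prev, which is not one of its parameters, so it
 * silently binds to whatever prev the caller happens to have in scope.
 */

#define dq_insert_mid(pre, mid, next) \
  do \
    { \
      mid->flink = next; \
      mid->blink = prev;  /* not the pre parameter: binds to caller's prev */ \
      pre->flink = mid; \
      next->blink = mid; \
    } \
  while (0)

void insert_example(struct dq_entry_s *prev, struct dq_entry_s *node,
                    struct dq_entry_s *next)
{
  /* This call works only because the argument passed as pre happens to be
   * named prev at this call site; with any other name the macro would fail
   * to compile or would silently use an unrelated prev variable.
   */

  dq_insert_mid(prev, node, next);
}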
On 5/6/25 23:19, Alan C. Assis wrote:
This file:
https://github.com/apache/nuttx/commit/9de9f8168d6de8eab8d3a97aac21aacc4e84dd84#diff-ac1d3cade3d9ab380d40fe31f8c006b3035c7b53d04c28e23d9f1ce9a176572c
On Tue, May 6, 2025 at 1:18 PM Alan C. Assis <acas...@gmail.com> wrote:
Hi Serg,
I did an analysis here and this macro was introduced on Jan 10, 2024, so it
was more than 1 year ago:
https://github.com/apache/nuttx/commit/9de9f8168d6de8eab8d3a97aac21aacc4e84dd84
BR,
Alan
On Tue, May 6, 2025 at 1:12 PM Alan C. Assis <acas...@gmail.com> wrote:
WOW! Nice catch!
Question to some people with more experience in the scheduler:
Why wasn't this issue detected before? It was added more than 9 months
ago.
Is there some way to test and enforce that nxsched_process_delivered()
and the other scheduler functions are working as expected?
BR,
Alan
On Tue, May 6, 2025 at 11:14 AM Serg Podtynnyi <s...@podtynnyi.com.invalid>
wrote:
Hi, I am still researching the sched problem on my config.
I found this macro code from last September - it looks like a pre vs prev
typo, right?
#define dq_insert_mid(pre, mid, next) \
  do \
    { \
      mid->flink = next; \
      mid->blink = prev; \
      pre->flink = mid; \
      next->blink = mid; \
    } \
  while (0)
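Presumably the fix is just to use the pre parameter there; a sketch of what I
would expect the corrected macro to look like (my guess, untested):

#define dq_insert_mid(pre, mid, next) \
  do \
    { \
      mid->flink = next; \
      mid->blink = pre;   /* was prev */ \
      pre->flink = mid; \
      next->blink = mid; \
    } \
  while (0)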
On 4/24/25 15:06, hujun260 wrote:
The lockcount can be understood as a local variable, so no race
conditions will occur.
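A rough illustration of what I mean (simplified, not the actual NuttX
sources): sched_lock()/sched_unlock() operate on the TCB returned by
this_task(), so a given task's lockcount is normally only written by the CPU
that task is currently running on.

  FAR struct tcb_s *rtcb = this_task(); /* the task running on THIS CPU */

  rtcb->lockcount++;                    /* no other CPU writes this field */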
At 2025-04-24 15:54:48, "Serg Podtynnyi"<s...@podtynnyi.com.INVALID>
wrote:
Hi, I tried my best to understand the logic behind the unlock and
merge_pending routines and came up with this fix that works for me.
I split the lockcount handling into two parts: when lockcount is > 1 we just
decrement it quickly as before,
but when it is 1 we enter the critical section and set it to 0 inside
it (at which point pre-emption is re-enabled).
I think there is a small window between the decrement in the if statement and
enter_critical_section_wo_note() in the original code where a race can
happen.
Does this make sense? Sorry if it's a dumb question - scheduler stuff
with SMP is still rocket-science level for me.
diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c
index 4c7d681454..15fc39b546 100644
--- a/sched/sched/sched_unlock.c
+++ b/sched/sched/sched_unlock.c
@@ -65,13 +65,14 @@ void sched_unlock(void)
       DEBUGASSERT(rtcb == NULL || rtcb->lockcount > 0);
-      /* Check if the lock counter has decremented to zero.  If so,
-       * then pre-emption has been re-enabled.
-       */
-
-      if (rtcb != NULL && --rtcb->lockcount == 0)
+      if (rtcb != NULL && rtcb->lockcount > 1)
+        {
+          rtcb->lockcount--;
+        }
+      else if (rtcb != NULL && rtcb->lockcount == 1)
         {
           irqstate_t flags = enter_critical_section_wo_note();
+          rtcb->lockcount = 0;
           /* Note that we no longer have pre-emption disabled. */
--
2.49.0
On 4/22/25 14:32, hujun260 wrote:
It seems to have nothing to do with a race condition: since the rtcb is
this_task, it will only be modified by one CPU.

As for why your modification solves the problem, I haven't figured it out
yet. Can you analyze further why the boot fails without it?
At 2025-04-22 13:25:13, "Serg Podtynnyi"<s...@podtynnyi.com.INVALID>
wrote:
Hi All, Gregory, Sebastien,
I'm porting NuttX to the rp2350 (pico2 board inside a ClockWork PicoCalc)
and
am having trouble getting the SMP (2-core) config running
without patching the sched lock/unlock routines.
I am looking at the commit
https://github.com/apache/nuttx/commit/914ae532e68bf4ca75c758d6f541a1d2fcdce8c3
and reverting it helps the system boot.
This is the patch that makes it work for me: I updated the current
lock/unlock by moving the enter_critical_section above the access to
rtcb->lockcount.
Do you think it could be a case of a race condition with
rtcb->lockcount?
PS
It's almost 11 years since I last touched NuttX - it's a pleasure to
work with it again.
From 86fa0a65fdd37804154faf3db52e2826281cdfbd Mon Sep 17 00:00:00 2001
From: Serg Podtynnyi<s...@podtynnyi.com>
Date: Tue, 22 Apr 2025 11:37:48 +0700
Subject: sched: update lock/unlock critical section entry

Move enter_critical_section up in sched lock/unlock so that it covers the
rtcb->lockcount access.

Signed-off-by: Serg Podtynnyi<s...@podtynnyi.com>
---
sched/sched/sched_lock.c | 4 ++--
sched/sched/sched_unlock.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/sched/sched/sched_lock.c b/sched/sched/sched_lock.c
index d8f75e7c90..9cc8da3df4 100644
--- a/sched/sched/sched_lock.c
+++ b/sched/sched/sched_lock.c
@@ -82,18 +82,18 @@ void sched_lock(void)
        * operations on this thread (on any CPU)
        */
+      irqstate_t flags = enter_critical_section_wo_note();
       if (rtcb != NULL && rtcb->lockcount++ == 0)
         {
 #if (CONFIG_SCHED_CRITMONITOR_MAXTIME_PREEMPTION >= 0) || \
     defined(CONFIG_SCHED_INSTRUMENTATION_PREEMPTION)
-          irqstate_t flags = enter_critical_section_wo_note();
           /* Note that we have pre-emption locked */
           nxsched_critmon_preemption(rtcb, true, return_address(0));
           sched_note_preemption(rtcb, true);
-          leave_critical_section_wo_note(flags);
 #endif
         }
+      leave_critical_section_wo_note(flags);
     }
 }
diff --git a/sched/sched/sched_unlock.c b/sched/sched/sched_unlock.c
index 4c7d681454..d21e007587 100644
--- a/sched/sched/sched_unlock.c
+++ b/sched/sched/sched_unlock.c
@@ -69,9 +69,9 @@ void sched_unlock(void)
        * then pre-emption has been re-enabled.
        */
+      irqstate_t flags = enter_critical_section_wo_note();
       if (rtcb != NULL && --rtcb->lockcount == 0)
         {
-          irqstate_t flags = enter_critical_section_wo_note();
           /* Note that we no longer have pre-emption disabled. */
@@ -162,7 +162,7 @@ void sched_unlock(void)
         }
 #endif
-          leave_critical_section_wo_note(flags);
         }
+      leave_critical_section_wo_note(flags);
     }
 }
--
2.49.0
--
Serg Podtynnyi