hujun260 commented on PR #16673: URL: https://github.com/apache/nuttx/pull/16673#issuecomment-3107048841
> > > Doesn't this actually simplify it a lot? It is basically removing a lot of redundant paths for scheduling between smp cores.
> > > I only measured the overall performance on a system with several tasks processing data with high frequency, and see no change.
> > > Do you have some specific tool in mind, to measure the context switching performance?
> > > I'll try to make this smaller by keeping rr out of it, sometime next week!
> >
> > Take the frequently called nxsched_add_readytorun/nxsched_process_delivered as examples.
> > Before your modification:
> > If a context switch occurs on the current CPU, there is no need to call any loop function. If a context switch occurs on another CPU, the loop operation only needs to be called once.
> > After your modification: At least two or three loop functions need to be called.
> > As for performance evaluation, it is actually not too difficult.
> > For example, there are 2 tasks, in two cases: on the same core or on different cores.
> > For instance, taskB is "waiting", and taskA initiates a "post" operation.
> > We need to calculate the time from when taskA initiates the operation to when taskB is awakened.
> > You can refer to the screenshots below
> > [image]
>
> Thanks!
>
> I made an improved version:
>
> * Removed the whole "mergepending", it didn't do anything reasonable any more
> * Removed other dead code (merge_prioritized)
> * Fixed a couple of other small issues
>
> Overall performance in my tests is slightly better than before.

In the implementation of nxsched_add_readytorun, we should call nxsched_select_cpu first and keep part of the previous implementation. When the task is scheduled onto another core, this avoids the call to nxsched_add_prioritized and so improves performance.
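
For comparing variants, the post-to-wake measurement described earlier in this thread could be sketched roughly as below. This is only a minimal illustration, assuming POSIX semaphores, pthreads and `CLOCK_MONOTONIC` are available. All names (`task_b`, `measure_wake_latency_ns`, `g_wake`, ...) are hypothetical and not taken from the PR. Pinning the two threads to the same or to different cores is left out because the affinity calls are platform specific.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>
#include <time.h>

/* All names below are hypothetical; none of this is code from the PR. */

static sem_t g_ready;            /* taskB -> taskA: "I am about to wait" */
static sem_t g_wake;             /* taskA -> taskB: the post being measured */
static struct timespec g_posted; /* Taken by taskA just before sem_post() */
static struct timespec g_woken;  /* Taken by taskB right after sem_wait() */

/* Nanoseconds elapsed between two CLOCK_MONOTONIC samples */

static int64_t elapsed_ns(const struct timespec *from,
                          const struct timespec *to)
{
  return (int64_t)(to->tv_sec - from->tv_sec) * 1000000000LL +
         (int64_t)(to->tv_nsec - from->tv_nsec);
}

/* taskB: announce itself, block on the semaphore, timestamp the wakeup */

static void *task_b(void *arg)
{
  (void)arg;
  sem_post(&g_ready);
  while (sem_wait(&g_wake) != 0)
    {
      /* Interrupted by a signal: retry */
    }

  clock_gettime(CLOCK_MONOTONIC, &g_woken);
  return NULL;
}

/* taskA: run one post -> wake round and return the latency in ns */

int64_t measure_wake_latency_ns(void)
{
  struct timespec settle = { 0, 10000000L }; /* 10 ms */
  pthread_t b;
  int ret;

  ret = sem_init(&g_ready, 0, 0);
  assert(ret == 0);
  ret = sem_init(&g_wake, 0, 0);
  assert(ret == 0);
  ret = pthread_create(&b, NULL, task_b, NULL);
  assert(ret == 0);
  (void)ret;

  while (sem_wait(&g_ready) != 0)
    {
      /* Interrupted by a signal: retry */
    }

  /* Best effort: give taskB time to actually block in sem_wait() */

  nanosleep(&settle, NULL);

  clock_gettime(CLOCK_MONOTONIC, &g_posted);
  sem_post(&g_wake);

  pthread_join(b, NULL);
  sem_destroy(&g_ready);
  sem_destroy(&g_wake);
  return elapsed_ns(&g_posted, &g_woken);
}
```

Calling `measure_wake_latency_ns()` many times and comparing the distribution before and after the change, once with both threads on one core and once on different cores, would give the two cases mentioned above. The 10 ms settle delay is an arbitrary choice that only makes it likely, not certain, that taskB is already blocked when the post happens.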