jasonbu opened a new pull request, #16344:
URL: https://github.com/apache/nuttx/pull/16344

   ## Summary
   Replacing the l1 entries takes a huge amount of time when text/data is big.
   Keep a per-task l1table instead, which needs 16KB per task on arm-v7a.
   The critical_section is also no longer required, since the table is maintained per task.
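
The 16KB figure follows from the ARMv7-A short-descriptor format: a full first-level table has one 4-byte entry per 1 MiB section of the 32-bit address space. A minimal sketch of that arithmetic, assuming a full-size table (TTBCR.N == 0); the helper names are hypothetical and not taken from the NuttX sources:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ARMv7-A short-descriptor format (assuming TTBCR.N == 0):
 * the L1 table covers the 4 GiB address space in 1 MiB sections,
 * with one 32-bit descriptor per section.
 */
#define VA_SPACE_BYTES   (UINT64_C(1) << 32)
#define L1_SECTION_BYTES (UINT64_C(1) << 20)
#define L1_ENTRY_BYTES   sizeof(uint32_t)

/* Number of descriptors in one full L1 table (hypothetical helper). */
static size_t l1_entry_count(void)
{
  return (size_t)(VA_SPACE_BYTES / L1_SECTION_BYTES);
}

/* Size in bytes of one full L1 table (hypothetical helper). */
static size_t l1_table_bytes(void)
{
  return l1_entry_count() * L1_ENTRY_BYTES;
}

/* Total memory spent on private L1 tables for ntasks tasks
 * (hypothetical helper, ignores any alignment padding).
 */
static size_t l1_tables_total_bytes(size_t ntasks)
{
  assert(ntasks <= SIZE_MAX / l1_table_bytes()); /* guard overflow */
  return ntasks * l1_table_bytes();
}
```

With these definitions, `l1_table_bytes()` yields 16384 bytes, so the memory cost grows by 16KB for every task that owns its own table.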
   
   For complex goldfish emulator scenarios, the time to reach
        enter_nsh/application_init_done
   drops from 23.517300/05:07.067500
   to 01.501500/00:06.587300,
   at least 10 times faster.
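
The speedup can be checked with a few lines of arithmetic. This is only a sketch: it assumes the timestamps above are in `[mm:]ss.ffffff` format, and both helper names are hypothetical.

```c
#include <assert.h>

/* Convert a minutes + seconds timestamp to plain seconds.
 * Assumption: the figures in the summary are [mm:]ss.ffffff.
 */
static double to_seconds(unsigned minutes, double seconds)
{
  assert(seconds >= 0.0 && seconds < 60.0);
  return minutes * 60.0 + seconds;
}

/* Ratio of the time before the patch to the time after it. */
static double speedup(double before_s, double after_s)
{
  assert(after_s > 0.0);
  return before_s / after_s;
}
```

Under that reading, `speedup(to_seconds(0, 23.5173), to_seconds(0, 1.5015))` gives roughly 15.7x for enter_nsh, and `speedup(to_seconds(5, 7.0675), to_seconds(0, 6.5873))` gives roughly 46.6x for application_init_done, both consistent with the "at least 10 times" claim.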
   
   While reviewing the code, I also found an unnecessary critical_section in the
   `up_invalidate_dcache_all` API and added a patch to remove it.
   
   For compatibility with va2pa driver usage: if the l2 entries lie inside the
   va map region, a pointer obtained from an arbitrary va2pa lookup can end up overwriting an l2 entry.
   
   Before the patch, the arm_addrenv_create_region memory layout is:
   l2entry_0 - pa_0 x256 - l2entry_1 - pa_1 x256 ...
   If a va2pa user accesses the tail of the first pa_0 x256 block, a memcpy or similar that runs past it will overwrite l2entry_1.
   
   After the patch, the arm_addrenv_create_region memory layout becomes:
   l2entry_0 - l2entry_1 - ... pa_0 x256 - pa_1 x256 ...
   This makes va2pa safer to use, because the l2 entries no longer sit between the data pages where they can be overwritten.
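
The difference between the two layouts can be modelled with slot indices. This is a simplified sketch, not the actual arm_addrenv_create_region code: it assumes each l2 table occupies exactly one page-sized slot, and every function name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_L2 256u  /* pages mapped by one l2 table ("pa_n x256") */

/* Old layout: l2entry_0, pa_0 x256, l2entry_1, pa_1 x256, ... */
static size_t old_l2_slot(size_t i)
{
  return i * (1 + PAGES_PER_L2);
}

static size_t old_page_slot(size_t i, size_t j)
{
  assert(j < PAGES_PER_L2);
  return old_l2_slot(i) + 1 + j;
}

/* In the old layout every (1 + 256)-th slot holds an l2 table. */
static int old_is_l2(size_t slot)
{
  return slot % (1 + PAGES_PER_L2) == 0;
}

/* New layout: l2entry_0 .. l2entry_(n-1), then pa_0 x256, pa_1 x256, ... */
static size_t new_l2_slot(size_t i, size_t n)
{
  assert(i < n);
  return i;
}

static size_t new_page_slot(size_t i, size_t j, size_t n)
{
  assert(i < n && j < PAGES_PER_L2);
  return n + i * PAGES_PER_L2 + j;
}

/* In the new layout only the first n slots hold l2 tables. */
static int new_is_l2(size_t slot, size_t n)
{
  return slot < n;
}
```

In this model, the slot right after the last page of pa_0 is `old_l2_slot(1)` in the old layout, so an overrun lands on an l2 table. In the new layout the same overrun lands on `new_page_slot(1, 0, n)`, which is ordinary data.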
   
   ## Impact
   When using the kernel build with complex apps, the arm-v7a swap method is too slow.
   In my case, rcS takes **5min** to finish; after the patch it only needs **6.5s**.
   
   ## Testing
   CI tests; local goldfish emulator for complex usage with screen, connection, etc.
   Ubuntu 24.04 with qemu-v7a-virt knsh-smp (master cannot pass ostest; the patch did not introduce new failures).
   

