On 18/06/2024 12:53 am, Huang, Haitao wrote: > From: Kristen Carlson Accardi <kris...@linux.intel.com> > > Previous patches have implemented all infrastructure needed for > per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC > pages are still tracked in the global LRU as sgx_epc_page_lru() always > returns reference to the global LRU. > > Change sgx_epc_page_lru() to return the LRU of the cgroup in which the > given EPC page is allocated. > > This makes all EPC pages tracked in per-cgroup LRUs and the global > reclaimer (ksgxd) will not be able to reclaim any pages from the global > LRU. However, in cases of over-committing, i.e., the sum of cgroup > limits greater than the total capacity, cgroups may never reclaim but > the total usage can still be near the capacity. Therefore a global > reclamation is still needed in those cases and it should be performed > from the root cgroup. > > Modify sgx_reclaim_pages_global(), to reclaim from the root EPC cgroup > when cgroup is enabled. Similar to sgx_cgroup_reclaim_pages(), return > the next cgroup so callers can use it as the new starting node for next > round of reclamation if needed. > > Also update sgx_can_reclaim_global(), to check emptiness of LRUs of all > cgroups when EPC cgroup is enabled, otherwise only check the global LRU. > > Finally, change sgx_reclaim_direct(), to check and ensure there are free > pages at cgroup level so forward progress can be made by the caller.
Reading above, it's not clear how the _new_ global reclaim works with multiple LRUs. E.g., the current global reclaim essentially treats all EPC pages equally when scanning those pages. From the above, I don't see how this is achieved in the new global reclaim. The changelog should: 1) describe the how does existing global reclaim work, and then describe how to achieve the same beahviour in the new global reclaim which works with multiple LRUs; 2) If there's any behaviour difference between the "existing" vs the "new" global reclaim, the changelog should point out the difference, and explain why such difference is OK. > > With these changes, the global reclamation and per-cgroup reclamation > both work properly with all pages tracked in per-cgroup LRUs. > [...] > > -static void sgx_reclaim_pages_global(struct mm_struct *charge_mm) > +static struct misc_cg *sgx_reclaim_pages_global(struct misc_cg *next_cg, > + struct mm_struct *charge_mm) > { > + if (IS_ENABLED(CONFIG_CGROUP_MISC)) > + return sgx_cgroup_reclaim_pages(misc_cg_root(), next_cg, > charge_mm); > + > sgx_reclaim_pages(&sgx_global_lru, charge_mm); > + return NULL; > } > > /* > @@ -414,12 +443,35 @@ static void sgx_reclaim_pages_global(struct mm_struct > *charge_mm) > */ > void sgx_reclaim_direct(void) > { > + struct sgx_cgroup *sgx_cg = sgx_get_current_cg(); > + struct misc_cg *cg = misc_from_sgx(sgx_cg); From below @sgx_cg could be NULL. It's not immediately clear whether calling misc_from_sgx(sgx_cg) unconditionally is safe here. Leave the initiaization of @cg to a later phase where @sgx_cg is guaranteed not being NULL, or initialize @cg to NULL here and update later. > + struct misc_cg *next_cg = NULL; > + > + /* > + * Make sure there are some free pages at both cgroup and global levels. > + * In both cases, only make one attempt of reclamation to avoid lengthy > + * block on the caller. > + */ > + if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg)) > + next_cg = sgx_cgroup_reclaim_pages(cg, next_cg, current->mm); I don't quite follow the logic. First of all, sgx_cgroup_reclaim_pages() isn't called in a loop, so why not just do: next_cg = sgx_cgroup_reclaim_pages(cg, NULL, current->mm); And what is the point of set @next_cg here, since ... > + > + if (next_cg != cg) > + put_misc_cg(next_cg); > + > + next_cg = NULL; ... here @next_cg is reset to NULL ? Looks the only reason is you need to do ... put_misc_cg(next_cg); ... above? This piece of code appears repeatedly in this file. Is there any way we can get rid of it? > if (sgx_should_reclaim_global(SGX_NR_LOW_PAGES)) > - sgx_reclaim_pages_global(current->mm); > + next_cg = sgx_reclaim_pages_global(next_cg, current->mm); And this doesn't seems "global reclaim" at all? Because it essentially equals to: next_cg = sgx_reclaim_pages_global(NULL, current->mm); which always reclaims from the ROOT. So each call to sgx_reclaim_direct() will always reclaim from the ROOT -- any other LRUs in the hierarchy will unlikely to get any chance to be reclaimed. > + > + if (next_cg != misc_cg_root()) > + put_misc_cg(next_cg); > + > + sgx_put_cg(sgx_cg); > } > > static int ksgxd(void *p) > { > + struct misc_cg *next_cg = NULL; > + > set_freezable(); > > /* > @@ -437,11 +489,15 @@ static int ksgxd(void *p) > kthread_should_stop() || > > sgx_should_reclaim_global(SGX_NR_HIGH_PAGES)); > > - if (sgx_should_reclaim_global(SGX_NR_HIGH_PAGES)) > + while (!kthread_should_stop() && > sgx_should_reclaim_global(SGX_NR_HIGH_PAGES)) { > /* Indirect reclaim, no mm to charge, so NULL: */ > - sgx_reclaim_pages_global(NULL); > + next_cg = sgx_reclaim_pages_global(next_cg, NULL); > + cond_resched(); Should the 'put_misc_cg(next_cg)' be done within the while() loop but not below? > + } > > - cond_resched(); > + if (next_cg != misc_cg_root()) > + put_misc_cg(next_cg); > + next_cg = NULL; Again, it doesn't seems "global reclaim" here, since you always restart from the ROOT once the target pages have been reclaimed. AFAICT it's completely different from the existing global reclaim. > } > > return 0; > @@ -583,6 +639,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) > */ > struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim > reclaim) > { > + struct misc_cg *next_cg = NULL; > struct sgx_cgroup *sgx_cg; > struct sgx_epc_page *page; > int ret; > @@ -616,10 +673,19 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, > enum sgx_reclaim reclaim) > break; > } > > - sgx_reclaim_pages_global(current->mm); > + /* > + * At this point, the usage within this cgroup is under its > + * limit but there is no physical page left for allocation. > + * Perform a global reclaim to get some pages released from any > + * cgroup with reclaimable pages. > + */ > + next_cg = sgx_reclaim_pages_global(next_cg, current->mm); > cond_resched(); > } Ditto IIUC.