Michael Ellerman <m...@ellerman.id.au> writes:
> Nathan Lynch <nath...@linux.ibm.com> writes:
>> Michael Ellerman <m...@ellerman.id.au> writes:
>>> Nathan Lynch <nath...@linux.ibm.com> writes:
>>>> Laurent Dufour <lduf...@linux.ibm.com> writes:
>>>>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>>>>> The drmem lmb list can have hundreds of thousands of entries, and
>>>>>> unfortunately lookups take the form of linear searches. As long as
>>>>>> this is the case, traversals have the potential to monopolize the CPU
>>>>>> and provoke lockup reports, workqueue stalls, and the like unless
>>>>>> they explicitly yield.
>>>>>>
>>>>>> Rather than placing cond_resched() calls within various
>>>>>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>>>>>> expression of the loop macro itself so users can't omit it.
>>>>>
>>>>> Isn't it too much to call cond_resched() on every LMB?
>>>>>
>>>>> Could it be less frequent, say every 10 or 100? I don't really know.
>>>>
>>>> Everything done within for_each_drmem_lmb() is relatively heavyweight
>>>> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
>>>> of milliseconds. I don't think cond_resched() is an expensive check in
>>>> this context.
>>>
>>> Hmm, mostly.
>>>
>>> But there are quite a few cases like drmem_update_dt_v1():
>>>
>>> 	for_each_drmem_lmb(lmb) {
>>> 		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
>>> 		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
>>> 		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
>>> 		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));
>>>
>>> 		dr_cell++;
>>> 	}
>>>
>>> That will compile to a pretty tight loop at the moment.
>>>
>>> Or drmem_update_dt_v2(), which has two loops over all lmbs.
>>>
>>> And although the actual TIF check is cheap, the function call to do it
>>> is not free.
>>>
>>> So I worry this is going to make some of those long loops take even
>>> longer.
>>
>> That's fair, and I was wrong: some of the loop bodies are relatively
>> simple, not doing allocations or taking locks, etc.
>>
>> One way to deal with this is to keep for_each_drmem_lmb() as-is and add
>> a new iterator that can reschedule, e.g. for_each_drmem_lmb_slow().
>
> If we did that, how many call sites would need converting?
> Is it ~2 or ~20 or ~200?
At a glance, I would convert 15-20 of the 24 users in the tree I'm looking at. Let me know if I should do a v2 with that approach.
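
For concreteness, something like the below is what I have in mind,
folding in Laurent's idea of yielding only every N elements rather
than on each one. Untested sketch against asm/drmem.h: the helper
name drmem_lmb_next() and the interval of 16 are placeholders, not
anything in the tree today.

	/*
	 * Sketch for arch/powerpc/include/asm/drmem.h. Assumes the
	 * existing global drmem_info (struct drmem_lmb_info *) and
	 * struct drmem_lmb; cond_resched() comes from <linux/sched.h>.
	 */
	static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
					const struct drmem_lmb *start)
	{
		/*
		 * Yield periodically so that very long traversals of the
		 * lmb array can't monopolize the CPU. The interval is
		 * arbitrary; it only needs to keep the cond_resched()
		 * call out of the common case.
		 */
		if ((++lmb - start) % 16 == 0)
			cond_resched();

		return lmb;
	}

	#define for_each_drmem_lmb_slow(lmb)				\
		for ((lmb) = &drmem_info->lmbs[0];			\
		     (lmb) < &drmem_info->lmbs[drmem_info->n_lmbs];	\
		     (lmb) = drmem_lmb_next((lmb), &drmem_info->lmbs[0]))

Tight loops like the drmem_update_dt_v1() one above would stay on plain
for_each_drmem_lmb(); only the call sites that do real work per element
(the dlpar_add_lmb()/dlpar_remove_lmb() paths and similar) would move
to the _slow variant.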