Nathan Lynch <nath...@linux.ibm.com> writes:
> Michael Ellerman <m...@ellerman.id.au> writes:
>> Nathan Lynch <nath...@linux.ibm.com> writes:
>>> Laurent Dufour <lduf...@linux.ibm.com> writes:
>>>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>>>> The drmem lmb list can have hundreds of thousands of entries, and
>>>>> unfortunately lookups take the form of linear searches. As long as
>>>>> this is the case, traversals have the potential to monopolize the
>>>>> CPU and provoke lockup reports, workqueue stalls, and the like
>>>>> unless they explicitly yield.
>>>>>
>>>>> Rather than placing cond_resched() calls within various
>>>>> for_each_drmem_lmb() loop blocks in the code, put it in the
>>>>> iteration expression of the loop macro itself so users can't
>>>>> omit it.
>>>>
>>>> Isn't calling cond_resched() on every LMB too much?
>>>>
>>>> Could it be less frequent, say every 10 or 100? I don't really know.
>>>
>>> Everything done within for_each_drmem_lmb is relatively heavyweight
>>> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take
>>> dozens of milliseconds. I don't think cond_resched() is an expensive
>>> check in this context.
>>
>> Hmm, mostly.
>>
>> But there are quite a few cases like drmem_update_dt_v1():
>>
>> 	for_each_drmem_lmb(lmb) {
>> 		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
>> 		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
>> 		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
>> 		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));
>>
>> 		dr_cell++;
>> 	}
>>
>> which will compile to a pretty tight loop at the moment.
>>
>> Or drmem_update_dt_v2(), which has two loops over all lmbs.
>>
>> And although the actual TIF check is cheap, the function call to do
>> it is not free.
>>
>> So I worry this is going to make some of those long loops take even
>> longer.
>
> That's fair, and I was wrong - some of the loop bodies are relatively
> simple, not doing allocations or taking locks, etc.
>
> One way to deal with that is to keep for_each_drmem_lmb() as-is and
> add a new iterator that can reschedule, e.g. for_each_drmem_lmb_slow().
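To make that concrete, here is a rough, untested sketch of such a
variant next to (roughly) the current definitions from
arch/powerpc/include/asm/drmem.h. The _slow name is just your
suggestion above, exactly where cond_resched() lands is illustrative,
and it assumes the header can pull in <linux/sched.h> for
cond_resched():

	/* Existing iterators: compile to a tight loop, no scheduling point. */
	#define for_each_drmem_lmb_in_range(lmb, start, end)		\
		for ((lmb) = (start); (lmb) < (end); (lmb)++)

	#define for_each_drmem_lmb(lmb)					\
		for_each_drmem_lmb_in_range((lmb),			\
			&drmem_info->lmbs[0],				\
			&drmem_info->lmbs[drmem_info->n_lmbs])

	/*
	 * Sketch only: a variant for heavyweight loop bodies (DLPAR
	 * add/remove and friends) that yields between elements so a
	 * walk over hundreds of thousands of LMBs can't monopolize
	 * the CPU.
	 */
	#define for_each_drmem_lmb_slow(lmb)				\
		for ((lmb) = &drmem_info->lmbs[0];			\
		     (lmb) < &drmem_info->lmbs[drmem_info->n_lmbs];	\
		     (lmb)++, cond_resched())

Call sites with expensive bodies would opt in to the _slow() variant,
and tight loops like drmem_update_dt_v1() would keep the cheap one.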
If we did that, how many call-sites would need converting? Is it ~2,
~20, or ~200?

> On the other hand... it's probably not too strong to say that the
> drmem/hotplug code is in crisis with respect to correctness and
> algorithmic complexity, so those are my overriding concerns right
> now. Yes, this change will pessimize loops that are reinitializing
> the entire drmem_lmb array on every DLPAR operation, but:
>
> 1. it doesn't make any user of for_each_drmem_lmb() less correct;
> 2. why is this code doing that in the first place, other than to
>    accommodate a poor data structure choice?
>
> The durations of the system calls where this code runs are measured
> in minutes or hours on large configurations because of all the
> behaviors that are at best O(n) in the amount of memory assigned to
> the partition. For simplicity's sake I'd rather defer lower-level
> performance considerations like this until the drmem data structures'
> awful lookup properties are fixed -- hopefully in the 5.10 timeframe.

Yeah, fair enough.

cheers
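P.S. If the per-element check did turn out to matter, Laurent's "every
10, or 100" idea could be folded into a small helper used by the
iteration expression. Again a rough, untested sketch; the helper name
drmem_lmb_next and the rate of 16 are arbitrary choices for
illustration:

	#include <linux/sched.h>	/* cond_resched() */

	static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
						       const struct drmem_lmb *start)
	{
		/* Yield only on every 16th element to keep tight loops cheap. */
		if (((++lmb - start) % 16) == 0)
			cond_resched();

		return lmb;
	}

	#define for_each_drmem_lmb_in_range(lmb, start, end)		\
		for ((lmb) = (start); (lmb) < (end);			\
		     (lmb) = drmem_lmb_next((lmb), (start)))

That would keep the scheduling point in the macro so callers can't
omit it, while amortizing the function-call cost over 16 elements.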