On Wed, Jun 09, 2021 at 02:30:36AM -0300, Leonardo Brás wrote:
> On Mon, 2021-06-07 at 15:20 +1000, David Gibson wrote:
> > On Fri, Apr 30, 2021 at 11:36:10AM -0300, Leonardo Bras wrote:
> > > During memory hotunplug, after each LMB is removed, the HPT may be
> > > resized-down if it would map a max of 4 times the current amount of
> > > memory.
> > > (2 shifts, due to introduced hysteresis)
> > >
> > > It usually is not an issue, but it can take a lot of time if HPT
> > > resizing-down fails. This happens because resize-down failures
> > > usually repeat at each LMB removal, until there are no more bolted
> > > entries conflict, which can take a while to happen.
> > >
> > > This can be solved by doing a single HPT resize at the end of
> > > memory hotunplug, after all requested entries are removed.
> > >
> > > To make this happen, it's necessary to temporarily disable all HPT
> > > resize-downs before hotunplug, re-enable them after hotunplug ends,
> > > and then resize-down HPT to the current memory size.
> > >
> > > As an example, hotunplugging 256GB from a 385GB guest took 621s
> > > without this patch, and 100s after applied.
> > >
> > > Signed-off-by: Leonardo Bras <leobra...@gmail.com>
> >
> > Hrm. This looks correct, but it seems overly complicated.
> >
> > AFAICT, the resize calls that this adds should in practice be the
> > *only* times we call resize, all the calls from the lower level code
> > should be suppressed.
>
> That's correct.
>
> > In which case can't we just remove those calls
> > entirely, and not deal with the clunky locking and exclusion here.
> > That should also remove the need for the 'shrinking' parameter in
> > 1/3.
> >
> If I get your suggestion correctly, you suggest something like:
> 1 - Never calling resize_hpt_for_hotplug() in
> hash__remove_section_mapping(), thus not needing the shrinking
> parameter.
> 2 - Functions in hotplug-memory.c that call dlpar_remove_lmb() would in
> fact call another function to do the batch resize_hpt_for_hotplug() for
> them

Basically, yes.

> If so, that assumes that no other function that currently calls
> resize_hpt_for_hotplug() under another path, or if they do, it does not
> need to actually resize the HPT.
>
> Is the above correct?
>
> There are some examples of functions that currently call
> resize_hpt_for_hotplug() by another path:
>
> add_memory_driver_managed
>     virtio_mem_add_memory
>     dev_dax_kmem_probe

Oh... virtio-mem. I didn't think of that.

> reserve_additional_memory
>     balloon_process
>     add_ballooned_pages

AFAICT this comes from drivers/xen, and Xen has never been a thing on
POWER.

> __add_memory
>     probe_store

So this is a sysfs triggered memory add. If the user is doing this
manually, then I think it's reasonable for them to manually manage the
HPT size as well, which they can do through debugfs. I think it might
also be used by drmgr under pHyp, but pHyp doesn't support HPT
resizing.

> __remove_memory
>     pseries_remove_memblock

Huh, this one comes through OF_RECONFIG_DETACH_NODE. I don't really
know when those happen, but I strongly suspect it's only under pHyp
again.

> remove_memory
>     dev_dax_kmem_remove
>     virtio_mem_remove_memory

virtio-mem again.

> memunmap_pages
>     pci_p2pdma_add_resource
>     virtio_fs_setup_dax

And virtio-fs in dax mode. Didn't think of that either.

Ugh, yeah, I'm used to the world where the platform provides the only
way of hotplugging memory, but virtio-mem does indeed provide another
one, and we could indeed need to manage the HPT size based on that.
Drat, so moving all the HPT resizing handling up into
pseries/hotplug-memory.c won't work.

I still think we can simplify the communication between the stuff in
the pseries hotplug code and the actual hash resizing. In your draft
there are kind of 3 ways the information is conveyed: the mutex
suppresses HPT shrinks, pre-growing past what we need prevents HPT
grows, and the 'shrinking' flag handles some edge cases.

I suggest instead a single flag that will suppress all the current
resizes.
Not sure it technically has to be an atomic mutex, but that's probably
the obvious safe choice.

Then have a "resize up to target" and "resize down to target" that
ignore that suppression and are no-ops if the target is in the other
direction.

Then you should be able to make the path for pseries hotplugs be:

    suppress other resizes

    resize up to target

    do the actual adds or removes

    resize down to target

    unsuppress other resizes

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
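
The five-step path suggested above can be sketched as a small userspace
C model. This is only an illustration under assumptions, not kernel
code: hpt_resize_suppressed, hpt_resize_auto(), hpt_resize_up_to(),
hpt_resize_down_to() and hotplug_batch() are hypothetical names, the
HPT is reduced to a single shift value, and resize_count exists only so
the behaviour can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not kernel symbols. */
static atomic_bool hpt_resize_suppressed; /* the single suppression flag */
static unsigned int hpt_shift = 20;       /* current HPT size, as a shift */
static int resize_count;                  /* number of resizes performed */

static void do_resize(unsigned int target_shift)
{
	hpt_shift = target_shift;
	resize_count++;
}

/* Models the existing lower-level resize calls: they honour the flag. */
static bool hpt_resize_auto(unsigned int target_shift)
{
	if (atomic_load(&hpt_resize_suppressed))
		return false;	/* suppressed, nothing done */
	if (target_shift != hpt_shift)
		do_resize(target_shift);
	return true;
}

/* Ignores the flag; no-op if the target is not larger than the HPT. */
static void hpt_resize_up_to(unsigned int target_shift)
{
	if (target_shift > hpt_shift)
		do_resize(target_shift);
}

/* Ignores the flag; no-op if the target is not smaller than the HPT. */
static void hpt_resize_down_to(unsigned int target_shift)
{
	if (target_shift < hpt_shift)
		do_resize(target_shift);
}

/* Models one batched pseries hotplug or hotunplug of nr_lmbs LMBs. */
static void hotplug_batch(unsigned int target_shift, int nr_lmbs)
{
	/* suppress other resizes */
	bool was_suppressed = atomic_exchange(&hpt_resize_suppressed, true);

	assert(!was_suppressed);	/* this model has no nested batches */
	(void)was_suppressed;

	/* resize up to target */
	hpt_resize_up_to(target_shift);

	/* do the actual adds or removes; per-LMB resize attempts are skipped */
	for (int i = 0; i < nr_lmbs; i++)
		(void)hpt_resize_auto(target_shift);

	/* resize down to target */
	hpt_resize_down_to(target_shift);

	/* unsuppress other resizes */
	atomic_store(&hpt_resize_suppressed, false);
}
```

In this model a batch performs at most one resize regardless of how
many LMBs it touches, while callers outside the batch (the virtio-mem
style paths, modelled by hpt_resize_auto()) keep resizing as before
whenever the flag is clear.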