* Michael Roth (mdr...@linux.vnet.ibm.com) wrote:
> Quoting Dr. David Alan Gilbert (2019-10-18 04:43:52)
> > * Laurent Vivier (lviv...@redhat.com) wrote:
> > > On 18/10/2019 10:16, Dr. David Alan Gilbert wrote:
> > > > * Scott Cheloha (chel...@linux.vnet.ibm.com) wrote:
> > > >> savevm_state's SaveStateEntry TAILQ is a priority queue. Priority
> > > >> sorting is maintained by searching from head to tail for a suitable
> > > >> insertion spot. Insertion is thus an O(n) operation.
> > > >>
> > > >> If we instead keep track of the head of each priority's subqueue
> > > >> within that larger queue we can reduce this operation to O(1) time.
> > > >>
> > > >> savevm_state_handler_remove() becomes slightly more complex to
> > > >> accommodate these gains: we need to replace the head of a priority's
> > > >> subqueue when removing it.
> > > >>
> > > >> With O(1) insertion, booting VMs with many SaveStateEntry objects is
> > > >> more plausible. For example, a ppc64 VM with maxmem=8T has 40000 such
> > > >> objects to insert.
> > > >
> > > > Separate from reviewing this patch, I'd like to understand why you've
> > > > got 40000 objects. This feels very very wrong and is likely to cause
> > > > problems to random other bits of qemu as well.
> > >
> > > I think the 40000 objects are the "dr-connectors" that are used to plug
> > > peripherals (memory, pci card, cpus, ...).
> >
> > Yes, Scott confirmed that in the reply to the previous version.
> > IMHO nothing in qemu is designed to deal with that many devices/objects
> > - I'm sure that something other than the migration code is going to get
> > upset.
>
> The device/object management aspect seems to handle things *mostly* okay, at
> least ever since QOM child properties started being tracked by a hash table
> instead of a linked list. It's worth noting that that change (b604a854) was
> done to better handle IRQ pins for ARM guests with lots of CPUs.
> I think it is inevitable that certain machine types/configurations will
> call for large numbers of objects and I think it is fair to improve things
> to allow for this sort of scalability.
>
> But I agree it shouldn't be abused, and you're right that there are some
> problem areas that arise. Trying to outline them:
>
>  a) introspection commands like 'info qom-tree' become pretty unwieldy,
>     and with large enough numbers of objects might even break things (QMP
>     response size limits maybe?)
>  b) various related lists like reset handlers, vmstate/savevm handlers might
>     grow quite large
>
> I think we could work around a) with maybe flagging certain
> "internally-only" objects as 'hidden'. Introspection routines could then
> filter these out, and routines like qom-set/qom-get could return
> something similar to EACCES so they are never used/useful to management
> tools.
>
> In cases like b) we can optimize things where it makes sense like with
> Scott's patch here. In most cases these lists need to be walked one way
> or another, whether it's done internally by the object or through common
> interfaces provided by QEMU. It's really just the O(n^2) type handling
> where relying on common interfaces becomes drastically less efficient,
> but I think we should avoid implementing things in that way anyway, or
> improve them as needed.
>
> >
> > Is perhaps the structure wrong somewhere - should there be a single DRC
> > device that knows about all DRCs?
>
> That's an interesting proposition, I think it's worth exploring further,
> but from a high level:
>
>  - each SpaprDrc has migration state, and some sub-classes of SpaprDrc (e.g.
>    SpaprDrcPhysical) have additional migration state. These are sent
>    as-needed as separate VMState entries in the migration stream.
>    Moving to a single DRC means we're either sending them as a flat
>    array or a sparse list, which would put just as much load on the
>    migration code (at least, with Scott's changes in place).
>    It would also be difficult to do all this in a way which maintains
>    migration compatibility with older machine types.

Having sparse arrays etc within a vmstate isn't as bad; none of them
actually need to be 'objects' as such - even if you have separate
chunks of VMState.

>  - other aspects of modeling these as QOM objects, such as look-ups,
>    reset-handling, and memory allocations, wouldn't be dramatically
>    improved upon by handling it all internally within the object
>
> AFAICT the biggest issue with modeling the DRCs as individual objects
> is actually how we deal with introspection, and we should try to
> improve that. What do you think of the alternative suggestion above of
> marking certain objects as 'hidden' from various introspection
> interfaces?

That's one for someone who knows/cares about QOM more than me;
Paolo, Dan Berrange, or Eduardo Habkost are QOM people.

Dave

> >
> > Dave
> >
> > >
> > > https://github.com/qemu/qemu/blob/master/hw/ppc/spapr_drc.c
> > >
> > > They are part of SPAPR specification.
> > >
> > > https://raw.githubusercontent.com/qemu/qemu/master/docs/specs/ppc-spapr-hotplug.txt
> > >
> > > CC Michael Roth
> > >
> > > Thanks,
> > > Laurent
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> >
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
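
For readers following the thread, the subqueue-head idea Scott describes
at the top can be sketched in plain C roughly as follows. This is an
illustration only, not the patch under review: it uses the BSD
<sys/queue.h> TAILQ macros rather than QEMU's own queue macros, and
struct entry, struct prio_queue, PRIO_MAX, pq_init(), pq_insert() and
pq_remove() are all hypothetical names. It assumes that higher priorities
sort nearer the head and that the number of priority levels is a small
constant, so the scan over lower priorities in pq_insert() does not grow
with the number of entries.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define PRIO_MAX 4  /* hypothetical number of priority levels */

struct entry {
    int priority;   /* 0 .. PRIO_MAX-1, higher sorts nearer the head */
    int id;
    TAILQ_ENTRY(entry) link;
};

TAILQ_HEAD(entry_list, entry);

struct prio_queue {
    struct entry_list entries;      /* one list, descending priority */
    struct entry *first[PRIO_MAX];  /* head of each subqueue, or NULL */
};

static void pq_init(struct prio_queue *q)
{
    TAILQ_INIT(&q->entries);
    for (int p = 0; p < PRIO_MAX; p++) {
        q->first[p] = NULL;
    }
}

/* Append e to the end of its priority's subqueue without walking entries. */
static void pq_insert(struct prio_queue *q, struct entry *e)
{
    struct entry *lower = NULL;

    assert(e->priority >= 0 && e->priority < PRIO_MAX);

    /* Find the head of the nearest non-empty lower-priority subqueue. */
    for (int p = e->priority - 1; p >= 0; p--) {
        if (q->first[p] != NULL) {
            lower = q->first[p];
            break;
        }
    }

    if (lower != NULL) {
        TAILQ_INSERT_BEFORE(lower, e, link);
    } else {
        TAILQ_INSERT_TAIL(&q->entries, e, link);
    }

    /* The first entry at this priority becomes its subqueue head. */
    if (q->first[e->priority] == NULL) {
        q->first[e->priority] = e;
    }
}

/* Removal must hand the subqueue head over to the next entry, if any. */
static void pq_remove(struct prio_queue *q, struct entry *e)
{
    if (q->first[e->priority] == e) {
        struct entry *next = TAILQ_NEXT(e, link);

        q->first[e->priority] =
            (next != NULL && next->priority == e->priority) ? next : NULL;
    }
    TAILQ_REMOVE(&q->entries, e, link);
}
```

The extra step in pq_remove() corresponds to the added complexity Scott
mentions for savevm_state_handler_remove(): when the entry being removed
is the recorded head of its priority's subqueue, the head pointer has to
move to the following entry of the same priority, or be cleared if there
is none.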