On Tue, 2013-06-04 at 18:50 +0200, Paolo Bonzini wrote: > Il 04/06/2013 15:24, Paolo Bonzini ha scritto: > > Il 04/06/2013 14:36, Peter Maydell ha scritto: > >> On 4 June 2013 13:31, Paolo Bonzini <pbonz...@redhat.com> wrote: > >>> Il 04/06/2013 14:24, Peter Maydell ha scritto: > >>>> On 4 June 2013 13:13, Paolo Bonzini <pbonz...@redhat.com> wrote: > >>>> This is much less flexible than just using sysbus_mmio_get_region(), > >>>> because it only lets you pass the whole set of MMIOs from the > >>>> other device through, not just the ones you want. > >>> > >>> How is this different from sysbus_pass_irq? > >> > >> sysbus_pass_irq is also an annoyingly inflexible function. > >> With MMIOs we have the advantage of being able to do better. > > > > I prefer consistency to useless flexibility. > > > > The day someone will need it, they can add sysbus_pass_one_{irq,mmio}. > > > >>>> Please just make reference counting work properly with passing > >>>> MemoryRegion*s around. > >>> > >>> Do you have any idea that doesn't require touch 800 invocation of the > >>> region creation functions? > >> > >> I think that would be a straightforward and easy to understand > >> way to define the ownership rules so I would much rather we > >> did that. I really don't like the way your current patch > >> is doing something complicated in an attempt to avoid this. > > > > They are straightforward, documented, and the wide majority of the > > devices need not care at all about them. By contrast, changing 800 > > invocations of the functions would be impossible to review seriously, it > > would have to be redone when boards are qdev/QOM-ified, would be worse > > for submitters of new boards. > > > > There are an order of magnitude less calls to memory_region_set_owner > > than to memory_region_init_*. Changing four places suffices to get > > ownership for 97% of the devices (309 files in hw/ call > > memory_region_init*, 9 devices call memory_region_set_owner): > > > > hw/core/sysbus.c:1 > > hw/isa/isa-bus.c:1 > > hw/pci/pci.c:1 > > ioport.c:2 > > > > Of the remaining calls, 2/3 of them are concentrated in a handful of > > devices: > > > > hw/display/cirrus_vga.c:7 > > hw/display/vga.c:4 > > hw/i386/kvm/pci-assign.c:7 > > hw/misc/vfio.c:8 > > A closer look at the code, and a better grep command, changed this down to: > > hw/display/cirrus_vga.c:2 > hw/display/qxl.c:1 > hw/display/vga-isa.c:1 > hw/display/vga.c:6 > hw/misc/vfio.c:8 > > (plus the ones quoted below) where all the calls in cirrus, qxl and vfio > could be removed realtively easily. In particular, VGA card > implementations do not use pci_register_vga yet, because it's a new API. > > So, with further refactoring, it could be brought down to 1 or 2 calls > in hw/display/vga.c and one in vga-isa.c. hw/display/vga.c needs the > memory_region_set_owner calls, both because of optimization tricks, and > because the code tries to cover both ISA and PCI. vga-isa.c needs it > because VGAs are special ISA devices, the only ones that do MMIO. > > Now, doing all the refactoring may not be worthwhile, but it shows that > the abstraction is even less leaky than it sounds. > > I asked Alex Williamson to read the thread and share his opinion. > Interestingly, he had a different mental model of building the memory > regions (passing them to PCI core early rather than late, and that's why > VFIO needed 8 calls in this series). So I believe his input will be useful.
We'll see about that ;) It's true that it's simply a mental model of doing the required steps, then optimizing that makes vfio need a sprinkling of set ownership calls. Paolo, your patch to move the PCI/VGA registration later solves this and completely hides memory region ownership from vfio. That's great, but as Peter is arguing, leaves a hole that I'm not even aware that an owner is required for a memory region and the API still leaves me lots of opportunities to get it wrong. So, I have to go back to Rusty's API design guidelines that an API should difficult to use incorrectly. From what I see, I'm not sure we have that here. An ugly compromise might be a runtime checks for orphan memory regions after a device is initialized, but that has it's own set of problems. Thanks, Alex