On Fri, Apr 29, 2011 at 06:31:03PM +0200, Jan Kiszka wrote:
> On 2011-04-29 18:20, Alex Williamson wrote:
> > On Fri, 2011-04-29 at 18:07 +0200, Jan Kiszka wrote:
> >> On 2011-04-29 17:55, Alex Williamson wrote:
> >>> On Fri, 2011-04-29 at 17:45 +0200, Jan Kiszka wrote:
> >>>> On 2011-04-29 17:38, Alex Williamson wrote:
> >>>>> On Fri, 2011-04-29 at 17:29 +0200, Jan Kiszka wrote:
> >>>>>> On 2011-04-29 17:06, Michael S. Tsirkin wrote:
> >>>>>>> On Thu, Apr 28, 2011 at 09:15:23PM -0600, Alex Williamson wrote:
> >>>>>>>> When we're trying to get a newly registered phys memory client
> >>>>>>>> updated with the current page mappings, we end up passing the
> >>>>>>>> region offset (a ram_addr_t) as the start address rather than the
> >>>>>>>> actual guest physical memory address (target_phys_addr_t). If your
> >>>>>>>> guest has less than 3.5G of memory, these are coincidentally the
> >>>>>>>> same thing. If
> >>>>>>
> >>>>>> I think this broke even with < 3.5G as phys_offset also encodes the
> >>>>>> memory type while region_offset does not. So everything became RAM
> >>>>>> this way, no MMIO was announced.
> >>>>>>
> >>>>>>>> there's more, the region offset for the memory above 4G starts over
> >>>>>>>> at 0, so the set_memory client will overwrite its lower memory
> >>>>>>>> entries.
> >>>>>>>>
> >>>>>>>> Instead, keep track of the guest physical address as we're walking
> >>>>>>>> the tables and pass that to the set_memory client.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Alex Williamson <alex.william...@redhat.com>
> >>>>>>>
> >>>>>>> Acked-by: Michael S. Tsirkin <m...@redhat.com>
> >>>>>>>
> >>>>>>> Given all this, can you tell how much time it takes to hotplug a
> >>>>>>> device with, say, a 40G RAM guest?
> >>>>>>
> >>>>>> Why not collect pages of identical types and report them as one chunk
> >>>>>> once the type changes?
> >>>>>
> >>>>> Good idea, I'll see if I can code that up. I don't have a terribly
> >>>>> large system to test with, but with an 8G guest, it's surprisingly not
> >>>>> very noticeable. For vfio, I intend to only have one memory client, so
> >>>>> adding additional devices won't have to rescan everything. The memory
> >>>>> overhead of keeping the list that the memory client creates is
> >>>>> probably also low enough that it isn't worthwhile to tear it all down
> >>>>> if all the devices are removed. Thanks,
> >>>>
> >>>> What other clients register late? Do they need to know the whole
> >>>> memory layout?
> >>>>
> >>>> This full page table walk is likely a latency killer as it happens
> >>>> under the global lock. Ugly.
> >>>
> >>> vhost and kvm are the only current users. kvm registers its client
> >>> early enough that there's no memory registered, so it doesn't really
> >>> need this replay through the page table walk. I'm not sure how vhost
> >>> works currently. I'm also looking at using this for vfio to register
> >>> pages for the iommu.
> >>
> >> Hmm, it looks like vhost is basically recreating the condensed, slotted
> >> memory layout from the per-page reports now. A bit inefficient,
> >> specifically as this happens per vhost device, no? And if vfio preferred
> >> a slotted format as well, you would end up copying vhost logic.
> >>
> >> That sounds to me like the qemu core should start tracking slots and
> >> report slot changes, not memory region registrations.
> >
> > I was thinking the same thing, but I think Michael is concerned that
> > we'll each need slightly different lists.
> > This is also where kvm is mapping to a fixed array of slots, which is
> > known to blow up with too many assigned devices. Needs to be fixed on
> > both the kernel and qemu side. Runtime overhead of the phys memory
> > client is pretty minimal, it's just the startup that thrashes
> > set_memory.
>
> I'm not just concerned about the runtime overhead. This is code
> duplication. Even if the format of the lists differs, their structure
> should not: one entry per contiguous memory region, and some lists may
> track sparsely based on their interests.
>
> I'm sure the core could be taught to help the clients create and
> maintain such lists. We already have two types of users in tree, you
> are about to create another one, and Xen should have some need for it
> as well.
>
> Jan
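
A minimal sketch of the chunking Jan suggests above, assuming the
2011-era CPUPhysMemoryClient callback set_memory(client, start_addr,
size, phys_offset) and that the bits of phys_offset below
TARGET_PAGE_MASK carry the memory type; the PageRun helper and its
fields are illustrative, not code from the tree:

/* Illustrative only (not from the QEMU tree): batch physically contiguous
 * pages whose phys_offset advances in lockstep during the replay walk and
 * report each run with a single set_memory() call instead of one call per
 * TARGET_PAGE_SIZE page. Relies on exec.c-era types from cpu-common.h
 * (target_phys_addr_t, ram_addr_t, CPUPhysMemoryClient). */
typedef struct PageRun {
    target_phys_addr_t start;   /* guest physical address of the run */
    ram_addr_t len;             /* run length in bytes */
    ram_addr_t phys_offset;     /* phys_offset of the run's first page */
    int valid;
} PageRun;

static void run_flush(CPUPhysMemoryClient *client, PageRun *run)
{
    if (run->valid) {
        client->set_memory(client, run->start, run->len, run->phys_offset);
        run->valid = 0;
    }
}

static void run_add_page(CPUPhysMemoryClient *client, PageRun *run,
                         target_phys_addr_t addr, ram_addr_t phys_offset)
{
    /* Extend the run only if the page is adjacent in guest physical space
     * and its phys_offset follows on from the previous page; anything else
     * (type change, hole, MMIO page) flushes the run and starts a new one. */
    if (run->valid &&
        addr == run->start + run->len &&
        phys_offset == run->phys_offset + run->len) {
        run->len += TARGET_PAGE_SIZE;
        return;
    }
    run_flush(client, run);
    run->start = addr;
    run->len = TARGET_PAGE_SIZE;
    run->phys_offset = phys_offset;
    run->valid = 1;
}

The walk would call run_add_page() for every present page and run_flush()
once at the end; a real implementation would also have to decide how to
merge adjacent pages of the same MMIO region, whose phys_offset does not
advance page by page.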
Absolutely. There should be some common code to deal with slots.

> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
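
For such common slot code, a rough sketch under the same API assumptions
might keep one entry per contiguous region and fold each reported range
into it; MemSlot, slot_update() and MAX_SLOTS are hypothetical names,
not existing QEMU or vhost structures:

/* Hypothetical core-side slot table: one entry per contiguous chunk of
 * guest physical address space, so that late-registering clients (vhost,
 * vfio, Xen) could be handed the condensed layout instead of replaying
 * every page through set_memory(). */
#define MAX_SLOTS 32

typedef struct MemSlot {
    target_phys_addr_t start_addr;  /* guest physical start */
    ram_addr_t size;                /* length in bytes */
    ram_addr_t phys_offset;         /* offset/type of the first page */
    int in_use;
} MemSlot;

static MemSlot mem_slots[MAX_SLOTS];

/* Fold a newly reported range into the table: extend an existing slot when
 * the range is adjacent both in guest space and in its backing offset,
 * otherwise take a free entry. Returns the slot index, or -1 when full. */
static int slot_update(target_phys_addr_t start, ram_addr_t size,
                       ram_addr_t phys_offset)
{
    int i, free_idx = -1;

    for (i = 0; i < MAX_SLOTS; i++) {
        MemSlot *s = &mem_slots[i];

        if (!s->in_use) {
            if (free_idx < 0) {
                free_idx = i;
            }
            continue;
        }
        if (start == s->start_addr + s->size &&
            phys_offset == s->phys_offset + s->size) {
            s->size += size;        /* grow the existing slot */
            return i;
        }
    }
    if (free_idx < 0) {
        return -1;                  /* a real version would grow the table */
    }
    mem_slots[free_idx].start_addr = start;
    mem_slots[free_idx].size = size;
    mem_slots[free_idx].phys_offset = phys_offset;
    mem_slots[free_idx].in_use = 1;
    return free_idx;
}

Each client could then filter this one list for the regions it cares
about (RAM only for vhost, everything needed for a vfio iommu client)
rather than each rebuilding it from per-page reports.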