On Mon, Oct 03, 2011 at 10:17:59AM +0200, Yonit Halperin wrote:
> On 10/02/2011 03:24 PM, Alon Levy wrote:
> > Hi,
> >
> > I'm trying to achieve the $subject. Some background: currently spice
> > relies on a preallocated pci bar for both surfaces and for the VGA
> > framebuffer + commands. I have been trying to get rid of the surfaces
> > bar. To do that I allocate memory in the guest and then translate it
> > for spice-server consumption using cpu_physical_memory_map.
> >
> > AFAIU this works only when the guest allocates a continuous range of
> > physical pages. This is a large requirement from the guest, which I'd
> > like to drop. So I would like to have the guest use a regular
> > allocator, generating for instance two sequential pages in virtual
> > memory that are scattered in physical memory. Those two guest physical
> > page addresses (gp1 and gp2) correspond to two host virtual memory
> > addresses (hv1, hv2). I would now like to provide to spice-server a
> > single virtual address p that maps to those two pages in sequence. I
> > don't want to handle my own scatter-gather list; I would like to have
> > this mapping done once so I can use an existing library that requires
> > a single pointer (for instance pixman or libGL) to do the rendering.
> >
> > Is there any way to achieve that without host kernel support, in user
> > space, i.e. in qemu? Or with an existing host kernel device?
> >
> > I'd appreciate any help,
> >
> > Alon
>
> Hi,
> won't there be an overhead for rendering on a non-continuous
> surface? Will it be worthwhile compared to not creating the
> surface?
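To restate what I'm after in the mail above: if qemu's guest RAM were backed by
a single mmap-able file descriptor (that is an assumption, e.g. something like
-mem-path; it is not something qemu guarantees in general), the two scattered
pages could in principle be stitched into one contiguous virtual window
entirely in userspace with plain mmap(MAP_FIXED). A minimal sketch, with a
hypothetical helper name and hypothetical page-aligned offsets gp1_off/gp2_off
derived from the guest physical addresses:

#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch only: ram_fd is assumed to back all of guest RAM;
 * gp1_off/gp2_off must be page-aligned offsets into it. */
static void *map_two_pages_contiguously(int ram_fd, off_t gp1_off, off_t gp2_off)
{
    long psz = sysconf(_SC_PAGESIZE);

    /* Reserve a contiguous two-page window of virtual address space. */
    void *base = mmap(NULL, 2 * psz, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Overlay each scattered backing page at consecutive virtual
     * addresses.  MAP_FIXED replaces the placeholder mapping in place,
     * and MAP_SHARED keeps writes visible through the guest's own mapping. */
    if (mmap(base, psz, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, ram_fd, gp1_off) == MAP_FAILED ||
        mmap((char *)base + psz, psz, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, ram_fd, gp2_off) == MAP_FAILED) {
        munmap(base, 2 * psz);
        return NULL;
    }
    return base;   /* single pointer usable by pixman/libGL style code */
}

That gives a single pointer covering both pages in sequence without any
copying; only the host page tables change.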
If I use a scatter-gather list there is overhead of allocating and copying the
surface whenever I want to synchronize: minimally once to copy from guest to
host, and another copy from host to guest for any update_area (we can only
copy the required area).

If I use page remapping, like remap_file_pages does, I don't think there is
any overhead for rendering. There is overhead for the remap_file_pages calls
themselves, but it is minimal (or so the man page says); I should benchmark
this. A rough sketch of what I mean is at the end of this mail. The additional
cost is not large - I suppose rendering should be more costly than a memcpy.
But the question stands regardless: some surfaces should probably be punted,
if we had an oracle to know they would be immediately update_area'ed and
destroyed.

> BTW. We should test if the split to vram (surfaces) and devram
> (commands and others) is more efficient than having one section.
> Even if it is more efficient, we can remove the split and give
> the surfaces higher allocation priority on a part of the pci bar.
> Anyway, by default, we can try allocating surfaces on the guest RAM.
> If it fails, we can try to allocate on the pci bar.

Right. What I was aiming at is removing the BAR altogether. This reduces
per-VM allocation, and we can still enforce a maximum via the driver. It also
reduces PCI requirements, which are a problem with more than one card.
Actually, the more productive thing for reducing PCI memory would be to change
to a single card for multiple monitor support.

Another reason for allocating on guest RAM is to make migration simpler (but
I'm not sure it really is).

> Cheers,
> Yonit.
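And the remap_file_pages() variant I mentioned above would look roughly like
this, under the same assumption that guest RAM is one fd-backed MAP_SHARED
region; the helper name and the page offsets gp1_pgoff/gp2_pgoff are made up
for illustration:

#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/* Sketch only: remap_file_pages() rearranges pages inside an existing
 * MAP_SHARED file mapping, so no second copy of the data is created. */
static void *remap_two_pages(int ram_fd, size_t gp1_pgoff, size_t gp2_pgoff)
{
    long psz = sysconf(_SC_PAGESIZE);

    /* Shared file mapping covering two pages; the initial file offset
     * does not matter, it is about to be rewritten. */
    void *base = mmap(NULL, 2 * psz, PROT_READ | PROT_WRITE,
                      MAP_SHARED, ram_fd, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Point virtual page 0 at gp1 and virtual page 1 at gp2; offsets are
     * given in units of the system page size, and only page tables are
     * touched. */
    if (remap_file_pages(base, psz, 0, gp1_pgoff, 0) != 0 ||
        remap_file_pages((char *)base + psz, psz, 0, gp2_pgoff, 0) != 0) {
        munmap(base, 2 * psz);
        return NULL;
    }
    return base;
}

The calls themselves should be cheap compared to rendering, which is why I
expect the per-remap overhead to be negligible; that is the part I still need
to benchmark.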