On Mon, Oct 03, 2011 at 10:17:59AM +0200, Yonit Halperin wrote:
> On 10/02/2011 03:24 PM, Alon Levy wrote:
> > Hi,
> >
> > I'm trying to achieve the $subject. Some background: currently spice
> > relies on a preallocated pci bar for both surfaces and for the VGA
> > framebuffer + commands. I have been trying to get rid of the surfaces
> > bar. To do that I allocate memory in the guest and then translate it
> > for spice-server consumption using cpu_physical_memory_map.
> >
> > AFAIU this works only when the guest allocates a continuous range of
> > physical pages. This is a large requirement from the guest, which I'd
> > like to drop. So I would like to have the guest use a regular
> > allocator, generating for instance two sequential pages in virtual
> > memory that are scattered in physical memory. Those two guest physical
> > page addresses (gp1 and gp2) correspond to two host virtual memory
> > addresses (hv1, hv2). I would now like to provide to spice-server a
> > single virtual address p that maps to those two pages in sequence. I
> > don't want to handle my own scatter-gather list; I would like to have
> > this mapping done once so I can use an existing library that requires
> > a single pointer (for instance pixman or libGL) to do the rendering.
> >
> > Is there any way to achieve that without host kernel support, in user
> > space, i.e. in qemu? Or with an existing host kernel device?
> >
> > I'd appreciate any help,
> >
> > Alon
>
> Hi,
> won't there be an overhead for rendering on a non-continuous
> surface? Will it be worthwhile compared to not creating the
> surface?
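To restate what I'm after in the mail above: if qemu's guest RAM were backed by
a single mmap-able file descriptor (that is an assumption, e.g. something like
-mem-path; it is not something qemu guarantees in general), the two scattered
pages could in principle be stitched into one contiguous virtual window
entirely in userspace with plain mmap(MAP_FIXED). A minimal sketch, with a
hypothetical helper name and hypothetical page-aligned offsets gp1_off/gp2_off
derived from the guest physical addresses:

#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch only: ram_fd is assumed to back all of guest RAM;
 * gp1_off/gp2_off must be page-aligned offsets into it. */
static void *map_two_pages_contiguously(int ram_fd, off_t gp1_off, off_t gp2_off)
{
    long psz = sysconf(_SC_PAGESIZE);

    /* Reserve a contiguous two-page window of virtual address space. */
    void *base = mmap(NULL, 2 * psz, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Overlay each scattered backing page at consecutive virtual
     * addresses.  MAP_FIXED replaces the placeholder mapping in place,
     * and MAP_SHARED keeps writes visible through the guest's own mapping. */
    if (mmap(base, psz, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, ram_fd, gp1_off) == MAP_FAILED ||
        mmap((char *)base + psz, psz, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, ram_fd, gp2_off) == MAP_FAILED) {
        munmap(base, 2 * psz);
        return NULL;
    }
    return base;   /* single pointer usable by pixman/libGL style code */
}

That gives a single pointer covering both pages in sequence without any
copying; only the host page tables change.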
If I use a scatter-gather list there is overhead of allocating and copying the
surface whenever I want to synchronize: minimally once to copy from guest to
host, and another copy from host to guest for any update_area (we can only
copy the required area).

If I use page remapping, like remap_file_pages does, I don't think there is
any overhead for rendering. There is overhead for the remap_file_pages calls
themselves, but it is minimal (or so the man page says); I should benchmark
this. A rough sketch of what I mean is at the end of this mail. The additional
cost is not large - I suppose rendering should be more costly than a memcpy.
But the question stands regardless: some surfaces should probably be punted,
if we had an oracle to know they would be immediately update_area'ed and
destroyed.

> BTW. We should test if the split to vram (surfaces) and devram
> (commands and others) is more efficient than having one section.
> Even if it is more efficient, we can remove the split and give
> the surfaces higher allocation priority on a part of the pci bar.
> Anyway, by default, we can try allocating surfaces on the guest RAM.
> If it fails, we can try to allocate on the pci bar.

Right. What I was aiming at is removing the BAR altogether. This reduces
per-VM allocation, and we can still enforce a maximum via the driver. It also
reduces PCI requirements, which are a problem with more than one card.
Actually, the more productive thing for reducing PCI memory would be to change
to a single card for multiple monitor support.

Another reason for allocating on guest RAM is to make migration simpler (but
I'm not sure it really is).

> Cheers,
> Yonit.
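And the remap_file_pages() variant I mentioned above would look roughly like
this, under the same assumption that guest RAM is one fd-backed MAP_SHARED
region; the helper name and the page offsets gp1_pgoff/gp2_pgoff are made up
for illustration:

#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/* Sketch only: remap_file_pages() rearranges pages inside an existing
 * MAP_SHARED file mapping, so no second copy of the data is created. */
static void *remap_two_pages(int ram_fd, size_t gp1_pgoff, size_t gp2_pgoff)
{
    long psz = sysconf(_SC_PAGESIZE);

    /* Shared file mapping covering two pages; the initial file offset
     * does not matter, it is about to be rewritten. */
    void *base = mmap(NULL, 2 * psz, PROT_READ | PROT_WRITE,
                      MAP_SHARED, ram_fd, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Point virtual page 0 at gp1 and virtual page 1 at gp2; offsets are
     * given in units of the system page size, and only page tables are
     * touched. */
    if (remap_file_pages(base, psz, 0, gp1_pgoff, 0) != 0 ||
        remap_file_pages((char *)base + psz, psz, 0, gp2_pgoff, 0) != 0) {
        munmap(base, 2 * psz);
        return NULL;
    }
    return base;
}

The calls themselves should be cheap compared to rendering, which is why I
expect the per-remap overhead to be negligible; that is the part I still need
to benchmark.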