On 10/11/2011 11:56 AM, Gleb Natapov wrote:
On Tue, Oct 11, 2011 at 11:49:16AM +0200, Avi Kivity wrote:
>  >Whatever we do, the interface will never be as fast as DMA. We will always
>  >have to do sanity / permission checks for every IO operation, can batch up only so
>  >many IO requests and in QEMU again have to call our callbacks in a loop.
>
>  We can batch per page, which makes the overhead negligible.
>
Current code batches userspace exits per 1024 bytes IIRC, and changing it
to a page didn't show significant improvement (also IIRC). But after the IO
data is copied into the kernel, the emulator processes it byte by byte. A
possible optimization, which I haven't tried, is to check that the
destination memory is not MMIO and write back the whole buffer if so.


All the permission checks, segment checks, register_address_increment, and page table walking can be done per page. Right now they are done per byte.

btw Intel also made this optimization: current processors copy complete cache lines instead of bytes, so they probably also do the checks just once.
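
To make the per-page idea concrete, here is a rough sketch, not the actual KVM emulator code. check_access() is a hypothetical stand-in for the permission, segment and page table checks and only counts its calls; guest memory is modelled as a flat byte array and a 4 KiB page size is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Hypothetical stand-in for permission/segment/page-table checks.
 * It only counts how many times it is invoked. */
static unsigned long check_calls;

static int check_access(uint64_t addr, size_t len)
{
    (void)addr;
    (void)len;
    check_calls++;
    return 0;             /* 0 means the access is allowed */
}

/* Per-byte variant: every byte pays for a full check. */
static int copy_per_byte(uint8_t *guest_mem, uint64_t dst,
                         const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (check_access(dst + i, 1))
            return -1;
        guest_mem[dst + i] = src[i];
    }
    return 0;
}

/* Per-page variant: split the buffer at page boundaries and check
 * each chunk once, then copy the whole chunk in one go. */
static int copy_per_page(uint8_t *guest_mem, uint64_t dst,
                         const uint8_t *src, size_t len)
{
    while (len > 0) {
        /* bytes left before the next page boundary */
        size_t room = PAGE_SIZE - (size_t)(dst & (PAGE_SIZE - 1));
        size_t chunk = len < room ? len : room;

        if (check_access(dst, chunk))
            return -1;
        memcpy(guest_mem + dst, src, chunk);

        dst += chunk;
        src += chunk;
        len -= chunk;
    }
    return 0;
}
```

With these definitions, copying 5000 bytes to offset 4000 calls check_access() 5000 times in the per-byte variant and 3 times in the per-page variant (chunks of 96, 4096 and 808 bytes).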

--
error compiling committee.c: too many arguments to function

