On Fri, Apr 20, 2012 at 12:21 AM, andrzej zaborowski <balr...@gmail.com> wrote:
> On 18 April 2012 14:35, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>> Recently there have been new SD card emulation patches so I want to
>> raise the issue of synchronous I/O while there is focus on the SD
>> subsystem.  Maybe some of the people who are improving the SD
>> subsystem will be able to help.
>>
>> sd_blk_read() and sd_blk_write() use the synchronous block I/O
>> functions to read/write data on behalf of the guest.  Device emulation
>> runs in the vcpu thread with the QEMU global mutex held, and therefore
>> both the guest vcpu and QEMU's own monitor and VNC server are
>> unresponsive while bdrv_read()/bdrv_write() is blocked.
>>
>> This makes bdrv_read()/bdrv_write() in device emulation code a
>> performance problem - the guest becomes unresponsive and laggy under
>> heavy I/O.  In extreme cases, like image files on NFS with a network
>> connectivity issue, it can affect the reliability of QEMU as a whole
>> because the monitor and VNC are unavailable until the I/O operation
>> completes.
>>
>> Device emulation should use the bdrv_aio_readv()/bdrv_aio_writev()
>> functions so that control can return to the guest.  When the I/O
>> operation completes a callback function is invoked and the device
>> emulation can signal completion to the guest - usually by setting bits
>> in hardware registers and raising an interrupt.  The result is good
>> responsiveness and the monitor/VNC remain available even under heavy
>> I/O.
>>
>> The challenge is how to convert hw/sd.c and possibly update emulated
>> SD controllers.  We need to stop assuming that a read/write operation
>> can be performed instantly and need to use a
>> bdrv_aio_readv()/bdrv_aio_writev() callback function to complete the
>> I/O.
>>
>> Since I am not familiar with the SD specification or the hw/sd.c code
>> very well I want to check:
>>
>> * Is anyone willing to convert the SD subsystem?
>>
>> * Will it be possible to convert just hw/sd.c without affecting
>>   emulated SD controllers?
>> * If we're going to need to fix all controllers in addition to
>>   hw/sd.c, then adding more controllers grows the problem.
>
> Yes, controllers would be affected, but there are various ways to go
> about it.  Some could be simple to implement (looking at
> pxa2xx_mmci.c).  First of all the SD specification pretty much assumes
> the storage medium is flash and data is available "immediately" after
> it is requested.  The host drives the clock and there's a fixed number
> of cycles that pass between a command and the response.  There's a
> mechanism for the card to indicate it is busy programming after data
> is written, but it doesn't apply to some types of writes.
>
> However the number of cycles between command and response can be
> different between card manufacturers, so it looks like the card can
> pull either the CMD and the DAT line high before starting to send the
> command response or the data.  In qemu you could either make the data
> transfers async, or the response transfers async, there's no need to
> do both.
>
> If the image is on a network filesystem then there could be problems
> caused by the synchronous IO.  Anything else, I'd guess that the
> caches, readahead and what not make sync IO the same or unnoticeably
> faster overall.  pxa2xx_mmci.c would be easy to convert to async, but
> some host controllers that are more software than hardware might
> theoretically give up if the card doesn't respond in N cycles.

Even in a case where the bus specification is strict about timing, it's
possible that the controllers that guest drivers talk to hide those
details and instead work on an interrupt-driven basis.

In other words, maybe most of the work will be converting controllers
to implement the busy state while we do actual block I/O.  Is this
possible, or do SD controllers expose the real low-level timing aspects
of the bus to the guest drivers?

Stefan
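
To make the "busy state while we do actual block I/O" idea concrete,
here is a minimal, self-contained C sketch.  It does not use QEMU's
block layer or hw/sd.c; every name in it (ToyBackend, ToyController,
toy_aio_read, toy_backend_poll and so on) is hypothetical and only
mimics the callback style of the bdrv_aio_*() functions.  The
controller sets a busy flag when a read starts and returns at once; a
completion callback later clears the flag and raises an "interrupt".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Hypothetical completion callback: ret is 0 on success, negative on error. */
typedef void ToyCompletionFunc(void *opaque, int ret);

/* Toy storage backend that can hold one in-flight read request. */
typedef struct ToyBackend {
    const uint8_t *image;      /* backing data, nblocks * BLOCK_SIZE bytes */
    size_t nblocks;
    bool pending;              /* a request has been queued, not completed */
    uint64_t block;
    uint8_t *dest;
    ToyCompletionFunc *cb;
    void *opaque;
} ToyBackend;

/* Toy controller state that a guest driver would see through registers. */
typedef struct ToyController {
    ToyBackend *backend;
    bool busy;                 /* set while block I/O is in flight */
    bool irq_raised;           /* set when completion is signalled */
    int last_error;
    uint8_t fifo[BLOCK_SIZE];
} ToyController;

/* Queue a read and return immediately; no data is copied here. */
static void toy_aio_read(ToyBackend *b, uint64_t block, uint8_t *dest,
                         ToyCompletionFunc *cb, void *opaque)
{
    b->pending = true;
    b->block = block;
    b->dest = dest;
    b->cb = cb;
    b->opaque = opaque;
}

/* Stand-in for the event loop: finish the queued request, if any. */
static bool toy_backend_poll(ToyBackend *b)
{
    int ret = 0;

    if (!b->pending) {
        return false;
    }
    b->pending = false;
    if (b->block >= b->nblocks) {
        ret = -1;              /* out-of-range read */
    } else {
        memcpy(b->dest, b->image + b->block * BLOCK_SIZE, BLOCK_SIZE);
    }
    b->cb(b->opaque, ret);
    return true;
}

/* Completion callback: leave the busy state and signal the guest. */
static void toy_read_done(void *opaque, int ret)
{
    ToyController *c = opaque;

    c->last_error = ret;
    c->busy = false;
    c->irq_raised = true;
}

/* Guest-triggered read: refuse while busy, otherwise start async I/O. */
static bool toy_controller_start_read(ToyController *c, uint64_t block)
{
    if (c->busy) {
        return false;
    }
    c->busy = true;
    c->irq_raised = false;
    toy_aio_read(c->backend, block, c->fifo, toy_read_done, c);
    return true;
}
```

A caller would invoke toy_controller_start_read(), go back to running
the guest while busy is set, and let toy_backend_poll() deliver the
completion later.  Whether a real controller can expose this kind of
busy bit to its guest driver is exactly the open question above.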