On 2012-09-11 11:54, Avi Kivity wrote:
> On 09/11/2012 12:44 PM, liu ping fan wrote:
>> On Tue, Sep 11, 2012 at 4:35 PM, Avi Kivity <a...@redhat.com> wrote:
>>> On 09/11/2012 10:51 AM, Liu Ping Fan wrote:
>>>> From: Liu Ping Fan <pingf...@linux.vnet.ibm.com>
>>>>
>>>> The function call chain can suffer from recursively holding
>>>> qemu_mutex_lock_iothread. We introduce a lockmap to record the
>>>> lock depth.
>>>
>>> What is the root cause? I/O handlers initiating I/O?
>>>
>> cpu_physical_memory_rw() can be called nested, and when called, it can
>> be protected by no lock, a device lock, or the big lock.
>
> Then we should look for a solution that is local to exec.c (and the
> nested dispatch problem). I think we can identify and fix all nested
> dispatches (converting them either to async DMA, or to letting the
> memory core do a single dispatch without indirection through I/O
> handlers).
>
>> I think that without the big lock, the io-dispatcher would face the
>> same issue. As to the main loop, I have not considered it carefully,
>> but at least dma-helper will call cpu_physical_memory_rw() with the
>> big lock held.
>
> DMA is inherently asynchronous, so we already drop the lock between
> initiation and completion; we need to find a way to make it easy to use
> the API without taking the lock while the transfer takes place.
We will have to review/rework device models that want to use the new
locking scheme so that they can drop their own lock while issuing DMA.
But that is surely non-trivial.

The other option is to keep DMA requests issued by devices synchronous
but let them fail if we are about to lock up. That still requires
changes, but it is probably more comprehensible for device model
developers.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux