On Tue, Sep 11, 2012 at 8:41 PM, Avi Kivity <a...@redhat.com> wrote:
> On 09/11/2012 03:24 PM, Avi Kivity wrote:
>> On 09/11/2012 12:57 PM, Jan Kiszka wrote:
>>> On 2012-09-11 11:44, liu ping fan wrote:
>>>> On Tue, Sep 11, 2012 at 4:35 PM, Avi Kivity <a...@redhat.com> wrote:
>>>>> On 09/11/2012 10:51 AM, Liu Ping Fan wrote:
>>>>>> From: Liu Ping Fan <pingf...@linux.vnet.ibm.com>
>>>>>>
>>>>>> The function call chain can end up taking qemu_mutex_lock_iothread
>>>>>> recursively. We introduce a lockmap to record the lock depth.
>>>>>
>>>>> What is the root cause? io handlers initiating I/O?
>>>>>
>>>> cpu_physical_memory_rw() can be called in a nested fashion, and when
>>>> it is called, it can be protected by no lock, the device lock, or the
>>>> big lock. I think that without the big lock, the io-dispatcher would
>>>> face the same issue. As to the main loop, I have not considered it
>>>> carefully, but at least dma-helpers will call cpu_physical_memory_rw()
>>>> with the big lock held.
>>>
>>> That is our core problem: inconsistent invocation of existing services
>>> w.r.t. locking. For portio, I was lucky that there is no nesting and I
>>> was able to drop the big lock around all (x86) call sites. But MMIO is
>>> way more tricky due to DMA nesting.
>>
>> Maybe we need to switch to a continuation style. Instead of expecting
>> cpu_physical_memory_rw() to complete synchronously, it becomes an
>> asynchronous call and you provide it with a completion. That means
>> devices which use it are forced to drop the lock in between. Block and
>> network clients will be easy to convert since they already use APIs
>> that drop the lock (except for accessing the descriptors).
>>
>>> We could try to introduce a different version of
>>> cpu_physical_memory_rw, cpu_physical_memory_rw_unlocked. But the
>>> problem remains that an MMIO request can trigger the very same access
>>> in a nested fashion, and we will have to detect this to avoid locking
>>> up QEMU (locking up the guest might be OK).
>>
>> An async version of c_p_m_rw() will just cause a completion to bounce
>> around, consuming cpu but not deadlocking anything. If we can keep a
>> count of the bounces, we might be able to stall it indefinitely or at
>> least ratelimit it.
>
> Another option is to require all users of c_p_m_rw() and related to use
> a coroutine or thread. That makes the programming easier (but still
> requires a revalidation after the dropped lock).
>
> --
> error compiling committee.c: too many arguments to function

For the nested cpu_physical_memory_rw(), we change its internals but
keep the API synchronous as it is (wrapping the current
cpu_physical_memory_rw() into cpu_physical_memory_rw_internal()):
LOCK()   // the device lock, the big lock, or both, depending on the caller
..............
cpu_physical_memory_rw()
{
    UNLOCK()   // drop all locks currently held
    queue_work_on_thread(cpu_physical_memory_rw_internal, completion);
    // cpu_physical_memory_rw_internal can take locks (device, big lock) again
    wait_for_completion(completion)
    LOCK()
}
..................
UNLOCK()

So cpu_physical_memory_rw_internal() is free to take locks, but with a
precondition: we still need to trace the lock stack taken around
cpu_physical_memory_rw(), so that it can be restored correctly before
returning to the caller. A compilable sketch of this pattern follows
below.

Is that OK?

Regards,
pingfan
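To make the idea concrete, here is a minimal, self-contained sketch of
the save-depth / drop / queue / wait / restore pattern. It is only an
illustration: big_lock, lock_depth, Completion, rw_internal and the
thread-per-request worker are made-up names for this sketch, not QEMU's
actual API or the real patch.

/* Build with: gcc -pthread sketch.c */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread recursion depth: a "lockmap" in its simplest form. */
static __thread int lock_depth;

static void big_lock_acquire(void)
{
    if (lock_depth++ == 0) {
        pthread_mutex_lock(&big_lock);
    }
}

static void big_lock_release(void)
{
    if (--lock_depth == 0) {
        pthread_mutex_unlock(&big_lock);
    }
}

/* Completion that the worker signals when the real access is done. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int done;
    uint8_t *buf;
    size_t len;
} Completion;

/* The real access: free to take the big lock again, because the
 * caller dropped all of its holds before queuing the work. */
static void *rw_internal(void *opaque)
{
    Completion *c = opaque;

    big_lock_acquire();
    memset(c->buf, 0xab, c->len);   /* stand-in for the actual MMIO/DMA */
    big_lock_release();

    pthread_mutex_lock(&c->lock);
    c->done = 1;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Synchronous facade: callers keep the old blocking API. */
static void my_cpu_physical_memory_rw(uint8_t *buf, size_t len)
{
    Completion c = { .done = 0, .buf = buf, .len = len };
    pthread_t worker;
    int saved_depth = lock_depth;

    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.cond, NULL);

    /* Drop the big lock entirely, remembering the caller's depth. */
    if (saved_depth) {
        lock_depth = 0;
        pthread_mutex_unlock(&big_lock);
    }

    /* A thread per request stands in for a proper work queue. */
    pthread_create(&worker, NULL, rw_internal, &c);

    pthread_mutex_lock(&c.lock);
    while (!c.done) {
        pthread_cond_wait(&c.cond, &c.lock);
    }
    pthread_mutex_unlock(&c.lock);
    pthread_join(worker, NULL);

    /* Restore the caller's lock stack before returning. */
    if (saved_depth) {
        pthread_mutex_lock(&big_lock);
        lock_depth = saved_depth;
    }

    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.lock);
}

int main(void)
{
    uint8_t buf[16];

    big_lock_acquire();                 /* caller holds the big lock */
    my_cpu_physical_memory_rw(buf, sizeof(buf));
    big_lock_release();
    printf("buf[0] = 0x%02x, depth = %d\n", buf[0], lock_depth);
    return 0;
}

Note that this only tracks the big lock; a real lockmap would also have
to record any device locks taken along the call chain, and the thread
per request is just a stand-in for a work queue or bottom half.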