On 2012-09-01 10:31, Avi Kivity wrote:
> On 08/29/2012 10:49 AM, Jan Kiszka wrote:
>>>
>>> Let's experiment with refcounting MemoryRegion. We can move the entire
>>> contents of MemoryRegion to MemoryRegionImpl, add a reference count (to
>>> MemoryRegionImpl), and change MemoryRegion to contain a pointer to the
>>> refcounted MemoryRegionImpl:
>>>
>>>   struct MemoryRegion {
>>>       struct MemoryRegionImpl *impl;
>>>   };
>>>
>>>   struct MemoryRegionImpl {
>>>       atomic int refs;
>>>       ...
>>>   };
>>>
>>> The memory core can then store MemoryRegion copies (with elevated
>>> refcounts) instead of pointers. Devices can destroy MemoryRegions at
>>> any time, the implementation will not go away. However, what of the
>>> opaque stored in MemoryRegionImpl? It becomes a dangling pointer.
>>>
>>> One way out is to add a lock to MemoryRegionImpl. Dispatch takes the
>>> lock, examines the ->enabled member, and bails out if it is cleared.
>>> The (MemoryRegion, not MemoryRegionImpl) destructor also takes the lock,
>>> clears ->enabled, releases the lock, and drops the reference.
>>
>> That means holding the MemoryRegionImpl lock across the handler call?
>
> Blech. As you said on the other side of this thread, we must not take a
> coarse grained lock within a fine grained one, and
> MemoryRegionImpl::lock is as fine as they get.
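For reference, the scheme quoted above can be written down as a compilable sketch. All names here are illustrative (mr_init, mr_dispatch etc. are not the actual QEMU API), and a plain pthread mutex stands in for whatever locking the memory core would really use:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct MemoryRegionImpl {
    int refs;                 /* protected by lock, for simplicity */
    bool enabled;             /* cleared by the MemoryRegion destructor */
    void *opaque;             /* device state; stale once disabled */
    pthread_mutex_t lock;
} MemoryRegionImpl;

typedef struct MemoryRegion {
    MemoryRegionImpl *impl;
} MemoryRegion;

static int touched;           /* test instrumentation */
static void touch_opaque(void *opaque)
{
    (void)opaque;
    touched++;
}

static MemoryRegionImpl *impl_ref(MemoryRegionImpl *impl)
{
    pthread_mutex_lock(&impl->lock);
    impl->refs++;
    pthread_mutex_unlock(&impl->lock);
    return impl;
}

static void impl_unref(MemoryRegionImpl *impl)
{
    pthread_mutex_lock(&impl->lock);
    int refs = --impl->refs;
    pthread_mutex_unlock(&impl->lock);
    if (refs == 0) {
        free(impl);
    }
}

static MemoryRegion mr_init(void *opaque)
{
    MemoryRegionImpl *impl = calloc(1, sizeof(*impl));
    impl->refs = 1;
    impl->enabled = true;
    impl->opaque = opaque;
    pthread_mutex_init(&impl->lock, NULL);
    return (MemoryRegion){ impl };
}

/* Dispatch: take the lock, bail out if the region has been destroyed.
 * Note the lock is held across the handler call -- exactly the point
 * questioned above. */
static bool mr_dispatch(MemoryRegionImpl *impl, void (*handler)(void *))
{
    pthread_mutex_lock(&impl->lock);
    if (!impl->enabled) {
        pthread_mutex_unlock(&impl->lock);
        return false;
    }
    handler(impl->opaque);
    pthread_mutex_unlock(&impl->lock);
    return true;
}

/* MemoryRegion destructor: clear ->enabled under the lock, then drop
 * the handle's reference; the impl lives on while the core holds refs. */
static void mr_destroy(MemoryRegion *mr)
{
    MemoryRegionImpl *impl = mr->impl;
    pthread_mutex_lock(&impl->lock);
    impl->enabled = false;
    pthread_mutex_unlock(&impl->lock);
    mr->impl = NULL;
    impl_unref(impl);
}
```

The sketch makes the objection concrete: mr_dispatch() holds MemoryRegionImpl::lock across handler(), so any coarser lock the handler takes nests inside this fine-grained one.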
Not sure what you compare here. MemoryRegionImpl::lock would be per
memory region, so its scope is finer than the BQL's and comparable to a
per-device lock's.

>
>>>
>>> The advantage to this scheme is that all changes are localized to the
>>> memory core, no need for a full sweep. It is a little complicated, but
>>> we may be able to simplify it (or find another one).
>>
>> May work. We just need to detect if a memory region tries to delete
>> itself from within its own handler to prevent the deadlock.
>
> Those types of hacks are fragile. IMO it just demonstrates what I said
> earlier (then tried to disprove with this): if we call an opaque's
> method, we must refcount or otherwise lock that opaque. No way around it.

But that still doesn't solve the problem that we need to lock down the
*state* of the opaque during dispatch with respect to memory region
changes. Just ensuring its existence is not enough unless we declare
memory region transactions to be asynchronous - and adapt all users.

>
>>
>>>
>>>>
>>>>>
>>>>>> Besides
>>>>>> MMIO and PIO dispatching, it will haunt us for file or event handlers,
>>>>>> any kind of callbacks etc.
>>>>>>
>>>>>> Context A              Context B
>>>>>> ---------              ---------
>>>>>> object = lookup()
>>>>>>                        deregister(object)
>>>>>>                        modify(object) -> invalid state
>>>>>> ... use(object)
>>>>>>                        modify(object) -> valid state
>>>>>>                        register(object)
>>>>>>
>>>>>> And with "object" I'm not talking about QOM but any data structure.
>>>>>
>>>>> Context B
>>>>> ---------
>>>>> rcu_read_lock()
>>>>> object = lookup()
>>>>> if (object) {
>>>>>     ref(object)
>>>>> }
>>>>> rcu_read_unlock()
>>>>>
>>>>> use(object)
>>>>>
>>>>> unref(object)
>>>>>
>>>>> (substitute locking scheme to conform to taste and/or dietary
>>>>> restrictions)
>>>>
>>>> + wait for refcount(object) == 0 in deregister(object). That's what I'm
>>>> proposing.
>>>
>>> Consider timer_del() called from a timer callback. It's often not doable.
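The lookup pattern quoted above can be made runnable with a plain mutex substituted for rcu_read_lock() (as the "conform to taste" remark invites). Names are illustrative; a single-slot registry stands in for a real lookup structure:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct Object {
    int refs;                 /* starts at 1: the registry's own reference */
    int state;
} Object;

static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static Object *registry;      /* single-slot "registry" for illustration */

/* Caller must hold registry_lock. */
static void ref(Object *obj)
{
    obj->refs++;
}

static void unref(Object *obj)
{
    pthread_mutex_lock(&registry_lock);
    int refs = --obj->refs;
    pthread_mutex_unlock(&registry_lock);
    if (refs == 0) {
        free(obj);
    }
}

static void register_object(Object *obj)
{
    pthread_mutex_lock(&registry_lock);
    registry = obj;           /* the registry keeps the initial reference */
    pthread_mutex_unlock(&registry_lock);
}

/* Context B: look up and pin the object before using it. */
static Object *lookup_and_ref(void)
{
    pthread_mutex_lock(&registry_lock);
    Object *obj = registry;
    if (obj) {
        ref(obj);
    }
    pthread_mutex_unlock(&registry_lock);
    return obj;
}

/* Context A: deregister only unlinks; the object lives on until the
 * last user drops its reference. */
static void deregister(void)
{
    pthread_mutex_lock(&registry_lock);
    Object *obj = registry;
    registry = NULL;
    pthread_mutex_unlock(&registry_lock);
    if (obj) {
        unref(obj);
    }
}
```

Note what this does and does not buy: the object cannot disappear under Context B, but nothing here stops Context A from mutating obj->state while B is using it - which is exactly the "existence is not enough" point.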
>>
>> This is inherently synchronous already (when working against the same
>> alarm timer backend). We can detect this in timer_del and skip waiting,
>> as in the above scenario.
>
> It can always be broken. The timer callback takes the device lock to
> update the device. The device mmio handler, holding the device lock,
> takes the timer lock to timer_mod. Deadlock.

Well, how is this solved in Linux? By waiting on the callback in
hrtimer_cancel. Not by wait_on_magic_opaque (well, there is even no
opaque in the hrtimer API).

Jan
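The hrtimer_cancel()-style behaviour referenced above, combined with the "detect and skip waiting" idea for timer_del() from the timer's own callback, can be sketched like this. These are hypothetical names (timer_fire/timer_cancel), not the hrtimer or QEMU timer API; the wait is keyed on the callback's running state, not on any opaque:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct Timer {
    pthread_mutex_t lock;
    pthread_cond_t idle;       /* signalled when the callback finishes */
    bool pending;
    bool running;
    pthread_t runner;          /* thread currently inside the callback */
    void (*cb)(struct Timer *);
} Timer;

/* Expiry path: mark the callback as running, drop the lock, invoke it,
 * then wake up anyone waiting in timer_cancel(). */
static void timer_fire(Timer *t)
{
    pthread_mutex_lock(&t->lock);
    if (!t->pending) {
        pthread_mutex_unlock(&t->lock);
        return;
    }
    t->pending = false;
    t->running = true;
    t->runner = pthread_self();
    pthread_mutex_unlock(&t->lock);

    t->cb(t);                  /* callback runs without the timer lock */

    pthread_mutex_lock(&t->lock);
    t->running = false;
    pthread_cond_broadcast(&t->idle);
    pthread_mutex_unlock(&t->lock);
}

/* Cancel and wait for a running callback -- unless we *are* the
 * callback, in which case waiting would deadlock, so it is skipped. */
static void timer_cancel(Timer *t)
{
    pthread_mutex_lock(&t->lock);
    t->pending = false;
    while (t->running && !pthread_equal(t->runner, pthread_self())) {
        pthread_cond_wait(&t->idle, &t->lock);
    }
    pthread_mutex_unlock(&t->lock);
}

static int fired;              /* test instrumentation */
static void self_cancel_cb(Timer *t)
{
    fired++;
    timer_cancel(t);           /* timer_del() from its own callback */
}
```

The deadlock Avi describes is still possible if the callback takes a device lock that an mmio handler holds while calling timer_cancel(); this only shows that the synchronization can live in the timer core rather than in per-opaque magic.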