Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

Avi Kivity Wed, 29 Aug 2012 10:31:20 -0700

On 08/29/2012 10:21 AM, Jan Kiszka wrote:
> On 2012-08-29 19:13, Avi Kivity wrote:
> > On 08/27/2012 06:01 PM, Jan Kiszka wrote:
> >> On 2012-08-27 22:53, Avi Kivity wrote:
> >>> On 08/27/2012 12:38 PM, Jan Kiszka wrote:
> >>>>>> Even worse, apply
> >>>>>> restrictions on how the dispatched objects, the regions, have to be
> >>>>>> treated because of this.
> >>>>>
> >>>>> Please elaborate.
> >>>>
> >>>> The fact that you can't manipulate a memory region object arbitrarily
> >>>> after removing it from the mapping because you track the references to
> >>>> the object that the region points to, not the region itself. The region
> >>>> remains in use by the dispatching layer and potentially the called
> >>>> device, even after deregistration.
> >>>
> >>> That object will be a container_of() the region, usually literally but
> >>> sometimes only in spirit.  Reference counting the regions means they
> >>> cannot be embedded into other objects any more.
> >>
> >> I cannot follow the logic of the last sentence. Reference counting just
> >> means that we track if there are users left, not necessarily that we
> >> perform asynchronous releases. We can simply wait for those users to
> >> complete.
> > 
> > I don't see how.  Suppose you add a reference count to MemoryRegion. 
> > How do you delay its containing object's destructor from running?  Do
> > you iterate over all member MemoryRegion and examine their reference counts?
>
> memory_region_transaction_commit (or calls that trigger this) will have
> to wait. That is required as the caller may start fiddling with the
> region right afterward.


However, it may be running within an mmio dispatch, so it may wait forever.

Ignoring this, how does it wait? sleep()? or wait for a completion?

> > 
> > Usually a reference count controls the lifetime of the reference counted
> > object, what are you suggesting here?
>
> To synchronize on reference count going to zero. 

Quite unorthodox.  Do you have real life examples?

> Or all readers leaving
> the read-side critical sections.

This is rcu.  But again, we may be in an rcu read-side critical section
while needing to wait.  In fact this was what I suggested in the first
place, until Marcelo pointed out the deadlock, so we came up with the
refcount.

>
> > 
> >>>
> >>> We can probably figure out a way to flush out accesses.  After switching
> >>> to rcu, for example, all we need is synchronize_rcu() in a
> >>> non-deadlocking place.  But my bet is that it will not be needed.
> >>
> >> If you properly flush out accesses, you don't need to track at device
> >> level anymore - and mess with abstraction layers. That's my whole point.
> > 
> > To flush out an access you need either rwlock_write_lock() or
> > synchronize_rcu() (depending on the implementation).  But neither of
> > these can be run from an rcu read-side critical section or
> > rwlock_read_lock().
> > 
> > You could defer the change to a bottom half, but if the hardware demands
> > that the change be complete before returning, that doesn't work.
>
> Right, therefore synchronous transactions.

I don't follow.  Synchronous transactions mean you can't
synchronize_rcu() or upgrade the lock or wait for the refcount.  That's
the source of the problem!

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem

Reply via email to