mirror: utilize job_exit shim

Max Reitz Sat, 25 Aug 2018 08:02:54 -0700

On 2018-08-23 00:05, John Snow wrote:
> 
> 
> On 08/22/2018 08:15 AM, Max Reitz wrote:
>> On 2018-08-17 21:04, John Snow wrote:
>>> Change the manual deferment to mirror_exit into the implicit
>>> callback to job_exit and the mirror_exit callback.
>>>
>>> This does change the order of some bdrv_unref calls and job_completed,
>>> but thanks to the new context in which we call .job_exit, this is safe
>>> to defer the possible flushing of any nodes to the job_finalize_single
>>> cleanup stage.
>>
>> Ah, right, I forgot this.  Hm, what exactly do you mean?  This function
>> is executed in the main loop, so it can make 'src' go away.  I don't see
>> any difference to before.
>>
> 
> This changes the order in which we unreference these objects; if you
> look at this patch the job_completed call I delete is in the middle of
> what becomes the .exit() callback, which means there is a subtle change
> in the ordering of how references are put down.
> 
> Take a look at the weird ordering of mirror_exit as it exists right now;
> we call job_completed first and *then* put down the last references. If
> you re-order this upstream right now, you'll deadlock QEMU because this
> means job_completed is responsible for putting down the last reference
> to some of these block/bds objects.
> 
> However, job_completed takes an additional AIO context lock and calls
> job_finalize_single under *two* locks, which will hang QEMU if we
> attempt to flush any of these nodes when we put down the last reference.


If you say so...  I have to admit I don't really understand.  The
comment doesn't explain why it's so important to keep src around until
job_completed(), so I don't know.  I thought AioContexts are recursive
so it doesn't matter whether you take them recursively or not.

Anyway.  So the difference now is that job_defer_to_main_loop() took the
lock around the whole exit function, whereas the new exit shim only
takes it around the .exit() method, but calls job_complete() without a
lock -- and then job_finalize_single() gets its lock again, so the job
methods are again called with locks.  That sounds OK to me.

> Performing the reordering here is *safe* because by removing the call to
> job_completed and utilizing the exit shim, the .exit() callback executes
> only under one lock, and when the finalize code runs later it is also
> executed under only one lock, making this re-ordering safe.
> 
> Clear as mud?

Well, I trust you that the drain issue was the reason that src had to
stay around until after job_completed().  It seems a bit
counter-intuitive, because the comment explaining that src needs to stay
around until job_completed() doesn't say much -- but it does imply that
without that bdrv_ref(), the BDS might be destroyed before
job_completed().  Which is different from simply having only one
reference left and then being deleted in job_completed().

Looking at 3f09bfbc7be, I'm inclined to believe the original reason may
be that src->job points to the job and that we shouldn't delete it as
long as it does (bdrv_delete() asserts that bs->job is NULL).  Oh no, a
tangent appears.

...I would assume that when bdrv_replace_node() is called, BlockJob.blk
is updated to point to the new BDS.  But nobody seems to update the
BDS.job field.  Investigation is in order.

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 5/7] block/mirror: utilize job_exit shim

Reply via email to