On 12/19/18 2:59 PM, Roger Pau Monné wrote:
>> Using 'current' means that potential deadlocks which would now cause a
>> BUG() won't anymore.  I'm fine with not adding extra protections that
>> aren't there now; but I don't want to remove protections that are.
> 
> The lock ordering enforcement is still kept as-is, but Xen is allowed
> to lock the caller mm locks in the right order (be it privileged or
> not) after having locked a subject domain ones also in the correct
> order.

Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16.

Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y.

And suppose we have H2, which grabs current->mm_lock_y, and tgt->mm_lock_x.

And suppose domA calls H1 on domB at the same time that domB calls H2 on
domA.  We could have the following sequence:

* H1: grab A->mm_lock_x
* H2: grab B->mm_lock_y
* H1: wait on B->mm_lock_y
* H2: wait on A->mm_lock_x #DEADLOCK

With the current mm lock checking, H2 would cause a BUG() the first time
it was called.

With my "special privilege only" boosting you'd also get a BUG() in at
least one of the invocations.

With your "current bias" patch, no BUG() would be encountered; we'd only
discover the deadlock once a live server had actually deadlocked.

This is what I'm talking about -- with "boost dom0", you have a global
order to the locks.  It goes:

1-8: All domU locks (in the order listed in mm-locks.h)
9-16: All dom0 locks

Thus we know for certain that there can't be a caller / callee deadlock
that's not detected.  With your patch, there isn't a global order: the
order changes based on who made the hypercall, so it's more difficult to
reason about whether there's a deadlock or not.

So, do H1 and H2 exist right now?  I think probably not, but I can't
immediately say.  Will such a pair *never* exist?  I don't think I can
guarantee that either.  That's why I want something to check.

> Using is_control_domain will limit the usage of paging_log_dirty_op to
> Dom0 (because that's the only domain that has is_privileged set,
> there's no way for the toolstack to create a domain with is_privileged
> set), no other paging domain will be able to use such hypercall
> without triggering the lock level check. Is that really what we
> want?
>
> I assume that under a distributed Xen system it would be possible to
> issue this hypercall from a domain different that Dom0 if it's given
> the right privileges?

I take it you mean a "disaggregated" system?  This is what Andy was
talking about with XSM.  As I said to him, at the moment, this will
_already_ fail, so it's not a regression.

I'm not willing to open up new potential holes for deadlocks,
particularly for a hypothetical use case that nobody has asked for and
doesn't work upstream at the moment.

If you want to make the calls succeed under a more "peer-to-peer"
arrangement, you're going to have to come up with something else:
Change the order of the mm locks, or grab the caller's p2m lock before
grabbing the paging lock, or refactoring what's covered by what, or
something.

Or, we can fix what's actually broken at the moment, and fix other
things when we encounter them.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to