On 12/19/18 2:59 PM, Roger Pau Monné wrote: >> Using 'current' means that potential deadlocks which would now cause a >> BUG() won't anymore. I'm fine with not adding extra protections that >> aren't there now; but I don't want to remove protections that are. > > The lock ordering enforcement is still kept as-is, but Xen is allowed > to lock the caller mm locks in the right order (be it privileged or > not) after having locked a subject domain ones also in the correct > order.
Suppose we have mm_lock_x, with value 8, and mm_lock_y with value 16. Suppose we have H1, which grabs current->mm_lock_x and tgt->mm_lock_y. And suppose we have H2, which grabs current->mm_lock_y, and tgt->mm_lock_x. And suppose domA calls H1 on domB at the same time that domB calls H2 on domA. We could have the following sequence: * H1: grab A->mm_lock_x * H2: grab B->mm_lock_y * H1: wait on B->mm_lock_y * H2: wait on A->mm_lock_x #DEADLOCK With the current mm lock checking, H2 would cause a BUG() the first time it was called. With my "special privilege only" boosting you'd also get a BUG() in at least one of the invocations. With your "current bias" patch, no BUG() would be encountered; we'd only discover the deadlock once a live server had actually deadlocked. This is what I'm talking about -- with "boost dom0", you have a global order to the locks. It goes: 1-8: All domU locks (in the order listed in mm-locks.h) 9-16: All dom0 locks Thus we know for certain that there can't be a caller / callee deadlock that's not detected. With your patch, there isn't a global order: the order changes based on who made the hypercall, so it's more difficult to reason about whether there's a deadlock or not. So, do H1 and H2 exist right now? I think probably not, but I can't immediately say. Will such a pair *never* exist? I don't think I can guarantee that either. That's why I want something to check. > Using is_control_domain will limit the usage of paging_log_dirty_op to > Dom0 (because that's the only domain that has is_privileged set, > there's no way for the toolstack to create a domain with is_privileged > set), no other paging domain will be able to use such hypercall > without triggering the lock level check. Is that really what we > want? > > I assume that under a distributed Xen system it would be possible to > issue this hypercall from a domain different that Dom0 if it's given > the right privileges? I take it you mean a "disaggregated" system? This is what Andy was talking about with XSM. As I said to him, at the moment, this will _already_ fail, so it's not a regression. I'm not willing to open up new potential holes for deadlocks, particularly for a hypothetical use case that nobody has asked for and doesn't work upstream at the moment. If you want to make the calls succeed under a more "peer-to-peer" arrangement, you're going to have to come up with something else: Change the order of the mm locks, or grab the caller's p2m lock before grabbing the paging lock, or refactoring what's covered by what, or something. Or, we can fix what's actually broken at the moment, and fix other things when we encounter them. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel