When the PoD lock isn't held across the entire region covering both the
P2M update and the stats update, the entry count - if it is to be
incorrect at all - should err on the side of being too large rather than
too small, to avoid functions bailing early when they find the count is
zero. However, instead of moving the increment ahead (and adjusting it
back upon failure), extend the PoD-locked region.
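To illustrate the race being closed (interleaving sketch only, not part
of the change; the reader side is modeled on the early zero-count
bail-out found in e.g. p2m_pod_decrease_reservation()):

    /* CPU A: mark_populate_on_demand(), pre-patch ordering */
    rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
                       p2m_populate_on_demand, p2m->default_access);
    /* Window: PoD entries now exist, but stats aren't updated yet. */

            /* CPU B: any consumer of the count, during that window */
            pod_lock(p2m);
            if ( p2m->pod.entry_count == 0 )
                goto out_unlock;  /* bails despite live PoD entries */

    /* CPU A, only now: */
    pod_lock(p2m);
    p2m->pod.entry_count += 1UL << order;
    pod_unlock(p2m);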
Fixes: 99af3cd40b6e ("x86/mm: Rework locking in the PoD layer")
Signed-off-by: Jan Beulich <jbeul...@suse.com>
---
The locked region could be shrunk again, by having multiple unlock
calls. But I think both ioreq_request_mapcache_invalidate() and
domain_crash() are fair enough to call with the lock still held?
---
v3: Extend locked region instead. Add Fixes: tag.
v2: Add comments.

--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -1348,16 +1348,22 @@ mark_populate_on_demand(struct domain *d
         }
     }
 
+    /*
+     * P2M update and stats increment need to collectively be under PoD lock,
+     * to prevent code elsewhere observing PoD entry count being zero despite
+     * there actually still being PoD entries (created by the p2m_set_entry()
+     * invocation below).
+     */
+    pod_lock(p2m);
+
     /* Now, actually do the two-way mapping */
     rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
                        p2m_populate_on_demand, p2m->default_access);
     if ( rc == 0 )
     {
-        pod_lock(p2m);
         p2m->pod.entry_count += 1UL << order;
         p2m->pod.entry_count -= pod_count;
         BUG_ON(p2m->pod.entry_count < 0);
-        pod_unlock(p2m);
 
         ioreq_request_mapcache_invalidate(d);
     }
@@ -1373,6 +1379,8 @@ mark_populate_on_demand(struct domain *d
         domain_crash(d);
     }
 
+    pod_unlock(p2m);
+
  out:
     gfn_unlock(p2m, gfn, order);
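For comparison, the shrunk-region alternative mentioned before the diff
would amount to duplicating the unlock on both exit paths (sketch only,
not what is being proposed; the else-path contents are elided):

    if ( rc == 0 )
    {
        p2m->pod.entry_count += 1UL << order;
        p2m->pod.entry_count -= pod_count;
        BUG_ON(p2m->pod.entry_count < 0);
        pod_unlock(p2m);  /* drop before the invalidate call */

        ioreq_request_mapcache_invalidate(d);
    }
    else
    {
        pod_unlock(p2m);  /* drop before domain_crash() */
        /* ... existing error handling, ending in domain_crash(d) ... */
    }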