On Mon, Oct 11, 2021 at 10:17:08AM +0200, Jan Beulich wrote:
> With NPT or shadow in use, the p2m_set_entry() -> p2m_pt_set_entry() ->
> write_p2m_entry() -> p2m_flush_nestedp2m() call sequence triggers a lock
> order violation when the PoD lock is held around it. Hence such flushing
> needs to be deferred. Steal the approach from p2m_change_type_range().
>
> Similarly for EPT I think ept_set_entry() -> ept_sync_domain() ->
> ept_sync_domain_prepare() -> p2m_flush_nestedp2m() is affected.
>
> Reported-by: Elliott Mitchell <ehem+...@m5p.com>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>
> --- a/xen/arch/x86/mm/p2m-pod.c
> +++ b/xen/arch/x86/mm/p2m-pod.c
> @@ -24,6 +24,7 @@
>  #include <xen/mm.h>
>  #include <xen/sched.h>
>  #include <xen/trace.h>
> +#include <asm/hvm/nestedhvm.h>
>  #include <asm/page.h>
>  #include <asm/paging.h>
>  #include <asm/p2m.h>
> @@ -494,6 +495,13 @@ p2m_pod_offline_or_broken_replace(struct
>  static int
>  p2m_pod_zero_check_superpage(struct p2m_domain *p2m, gfn_t gfn);
>
> +static void pod_unlock_and_flush(struct p2m_domain *p2m)
> +{
> +    pod_unlock(p2m);
> +    p2m->defer_nested_flush = false;
> +    if ( nestedhvm_enabled(p2m->domain) )
> +        p2m_flush_nestedp2m(p2m->domain);
> +}
>
>  /*
>   * This function is needed for two reasons:
> @@ -514,6 +522,7 @@ p2m_pod_decrease_reservation(struct doma
>
>      gfn_lock(p2m, gfn, order);
>      pod_lock(p2m);
> +    p2m->defer_nested_flush = true;
>
>      /*
>       * If we don't have any outstanding PoD entries, let things take their
> @@ -665,7 +674,7 @@ out_entry_check:
>      }
>
>  out_unlock:
> -    pod_unlock(p2m);
> +    pod_unlock_and_flush(p2m);
>      gfn_unlock(p2m, gfn, order);
>      return ret;
>  }
> @@ -1144,8 +1153,10 @@ p2m_pod_demand_populate(struct p2m_domai
>       * won't start until we're done.
>       */
>      if ( unlikely(d->is_dying) )
> -        goto out_fail;
> -
> +    {
> +        pod_unlock(p2m);
> +        return false;
> +    }
>
>      /*
>       * Because PoD does not have cache list for 1GB pages, it has to remap
> @@ -1167,6 +1178,8 @@ p2m_pod_demand_populate(struct p2m_domai
>                        p2m_populate_on_demand, p2m->default_access);
>      }
>
> +    p2m->defer_nested_flush = true;
> +
>      /* Only reclaim if we're in actual need of more cache. */
>      if ( p2m->pod.entry_count > p2m->pod.count )
>          pod_eager_reclaim(p2m);
> @@ -1229,8 +1242,9 @@ p2m_pod_demand_populate(struct p2m_domai
>          __trace_var(TRC_MEM_POD_POPULATE, 0, sizeof(t), &t);
>      }
>
> -    pod_unlock(p2m);
> +    pod_unlock_and_flush(p2m);
>      return true;
> +
>  out_of_memory:
>      pod_unlock(p2m);
Don't you need to set defer_nested_flush = false in the out_of_memory
label? (as you don't call pod_unlock_and_flush, which would do it)

Thanks, Roger.
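
For illustration only, a minimal sketch of one possible shape for that
error path (an assumption on my part, not a proposed follow-up), reusing
the pod_unlock_and_flush() helper introduced by the patch so the deferral
flag gets cleared there as well:

     out_of_memory:
        /*
         * Hypothetical adjustment: reuse the new helper so that
         * defer_nested_flush is reset (and any pending nested flush
         * performed) on this path too, mirroring the success path.
         */
        pod_unlock_and_flush(p2m);

Alternatively, simply setting p2m->defer_nested_flush = false next to the
existing pod_unlock(p2m) would clear the flag without triggering a nested
flush on a path that is failing anyway.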