On Mon, Apr 11, 2022 at 11:47:46AM +0200, Jan Beulich wrote:
> While it is okay for IOMMU page tables to be set up for guests starting
> in PoD mode, actual device assignment may only occur once all PoD
> entries have been removed from the P2M. So far this was enforced only
> for boot-time assignment, and only in the tool stack.
>
> Also use the new function to replace p2m_pod_entry_count(): Its unlocked
> access to p2m->pod.entry_count wasn't really okay (irrespective of the
> result being stale by the time the caller gets to see it). Nor was the
> use of that function in line with the immediately preceding comment: A
> PoD guest isn't just one with a non-zero entry count, but also one with
> a non-empty cache (e.g. prior to actually launching the guest).
>
> To allow the tool stack to see a consistent snapshot of PoD state, move
> the tail of XENMEM_{get,set}_pod_target handling into a function, adding
> proper locking there.
>
> In libxl take the liberty to use the new local variable r also for a
> pre-existing call into libxc.
>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

Reviewed-by: Roger Pau Monné <roger....@citrix.com>

Just one comment below.

> ---
> If p2m->pod.entry_count == p2m->pod.count it is in principle possible to
> permit device assignment by actively resolving all remaining PoD entries.
>
> Initially I thought this was introduced by f89f555827a6 ("remove late
> (on-demand) construction of IOMMU page tables"), but without
> arch_iommu_use_permitted() checking for PoD I think the issue has been
> there before that.
> ---
> v4: Drop tool stack side change (superseded by 07449ecfa425). Extend VM
>     event related paragraph of description.
> v3: In p2m_pod_set_mem_target() move check down.
> v2: New.
>
> --- a/xen/arch/x86/mm/p2m-pod.c
> +++ b/xen/arch/x86/mm/p2m-pod.c
> @@ -20,6 +20,7 @@
>   */
>
>  #include <xen/event.h>
> +#include <xen/iocap.h>
>  #include <xen/ioreq.h>
>  #include <xen/mm.h>
>  #include <xen/sched.h>
> @@ -360,7 +361,10 @@ p2m_pod_set_mem_target(struct domain *d,
>
>      ASSERT( pod_target >= p2m->pod.count );
>
> -    ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/);
> +    if ( has_arch_pdevs(d) || cache_flush_permitted(d) )
> +        ret = -ENOTEMPTY;

ENOTEMPTY seems weird here. I think the reasoning is that the set of
passthrough devices is not empty?

IMO it's confusing as the function itself is related to buffer
management, so returning ENOTEMPTY could be confused with some other
condition. Might be less ambiguous to use EXDEV.

Thanks, Roger.
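
For illustration only, here is a minimal standalone sketch of what the hunk would look like with EXDEV instead of ENOTEMPTY. It is not Xen code: struct mock_domain and the mock_* helpers are hypothetical stand-ins for struct domain, has_arch_pdevs(), cache_flush_permitted() and p2m_pod_set_cache_target(), and it uses the host's <errno.h> values rather than Xen's own errno definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct domain that matter here. */
struct mock_domain {
    bool has_pdevs;          /* models has_arch_pdevs(d) */
    bool cache_flush_ok;     /* models cache_flush_permitted(d) */
    unsigned long pod_count; /* models p2m->pod.count */
};

/* Hypothetical stand-in for p2m_pod_set_cache_target(); always succeeds. */
static int mock_set_cache_target(struct mock_domain *d, unsigned long target)
{
    d->pod_count = target;
    return 0;
}

/*
 * Models the tail of p2m_pod_set_mem_target(): refuse to touch the PoD
 * cache once devices (or I/O resources) are assigned, and report that
 * with -EXDEV so the error points at devices, not at buffer state.
 */
static int mock_pod_set_mem_target(struct mock_domain *d, unsigned long target)
{
    if ( d->has_pdevs || d->cache_flush_ok )
        return -EXDEV;

    return mock_set_cache_target(d, target);
}
```

With this, a caller that gets -EXDEV back can tell that the refusal is about assigned devices, while the cache target of the domain is left unchanged.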