Hi Andrew,
On 01/06/2016 22:24, Andrew Cooper wrote:
On 01/06/2016 21:45, Aaron Cornelius wrote:
However, since I only have 1 domain active at a time, I'm not sure why I
should run out of VM IDs.
Sounds like a VMID resource leak. Check to see whether it is freed properly
in domain_destroy().
~Andrew
That would be my assumption. But as far as I can tell, arch_domain_destroy()
calls p2m_teardown() which calls p2m_free_vmid(), and none of the functionality
related to freeing a VM ID appears to have changed in years.
The VMID handling looks suspect. It can be called repeatedly during
domain destruction, and it will repeatedly clear the same bit out of the
vmid_mask.
Can you explain how p2m_free_vmid can be called multiple times?
We have the following path:
arch_domain_destroy -> p2m_teardown -> p2m_free_vmid.
And I can find only 3 calls to arch_domain_destroy, which should only be
done once per domain.
If arch_domain_destroy is called multiple times, p2m_free_vmid will not
be the only place where Xen will be in trouble.
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 838d004..7adb39a 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1393,7 +1393,10 @@ static void p2m_free_vmid(struct domain *d)
     struct p2m_domain *p2m = &d->arch.p2m;
     spin_lock(&vmid_alloc_lock);
     if ( p2m->vmid != INVALID_VMID )
-        clear_bit(p2m->vmid, vmid_mask);
+    {
+        ASSERT(test_and_clear_bit(p2m->vmid, vmid_mask));
+        p2m->vmid = INVALID_VMID;
+    }
     spin_unlock(&vmid_alloc_lock);
 }
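
To make the intent of the change above easier to discuss, here is a minimal
userspace sketch of the same idea: free the VMID at most once, and notice if
the bit was not set when we expected it to be. This is not Xen code. The
struct p2m_sketch type, the vmid_used[] array (standing in for vmid_mask),
and both helper functions are hypothetical names, INVALID_VMID being 0 is an
assumption, and locking is left out. One deliberate difference: the clear is
done outside the assertion, because standard C assert() does not evaluate its
argument when NDEBUG is defined. Whether the same concern applies to Xen's
ASSERT() in non-debug builds is worth checking before using the patch as-is.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; names mirror the patch but none of this is Xen code. */
#define MAX_VMID     256
#define INVALID_VMID 0      /* assumption: 0 means "no VMID assigned" */

struct p2m_sketch {
    unsigned int vmid;
};

/* Stand-in for the vmid_mask bitmap: one flag per VMID. */
static bool vmid_used[MAX_VMID];

/* Clear the slot and report whether it was set, like test_and_clear_bit(). */
static bool test_and_clear_vmid(unsigned int vmid)
{
    bool was_set = vmid_used[vmid];

    vmid_used[vmid] = false;
    return was_set;
}

/* Returns true if a VMID was released, false if there was nothing to free. */
static bool free_vmid_sketch(struct p2m_sketch *p2m)
{
    bool was_set;

    if ( p2m->vmid == INVALID_VMID )
        return false;       /* already freed: a repeated call is a no-op */

    /* Clear first, then check, so the clear still happens with NDEBUG. */
    was_set = test_and_clear_vmid(p2m->vmid);
    assert(was_set);
    (void)was_set;          /* avoid an unused warning when asserts are off */

    p2m->vmid = INVALID_VMID;
    return true;
}
```

Calling free_vmid_sketch() twice on the same structure releases the slot on
the first call and does nothing on the second.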
Having said that, I can't explain why that bug would result in the
symptoms you are seeing. It is also possible that your issue is memory
corruption from a separate source.
Can you see about instrumenting p2m_alloc_vmid()/p2m_free_vmid() (with
vmid_alloc_lock held) to see which vmid is being allocated/freed?
After the initial boot of the system, you should see the same vmid being
allocated and freed for each of your domains.
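
As a rough illustration of what such tracing would show, here is a userspace
sketch, not the Xen functions themselves. alloc_vmid_traced() and
free_vmid_traced() are hypothetical names, the first-fit search and the
reserved value 0 are assumptions, and printf stands in for whatever logging
the hypervisor would use. With one domain alive at a time, the trace should
show the same vmid handed out again after every free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_VMID     256
#define INVALID_VMID 0      /* assumption: 0 is reserved and never handed out */

/* Stand-in for vmid_mask; the reserved slot is marked as used up front. */
static bool vmid_used[MAX_VMID] = { [INVALID_VMID] = true };

/* First-fit allocation with a trace line; returns -1 when the pool is full. */
static int alloc_vmid_traced(int domid)
{
    for ( int v = 0; v < MAX_VMID; v++ )
    {
        if ( !vmid_used[v] )
        {
            vmid_used[v] = true;
            printf("d%d: alloc vmid %d\n", domid, v);
            return v;
        }
    }

    printf("d%d: VMID pool exhausted\n", domid);
    return -1;
}

/* Release a VMID, logging whether its slot was actually marked as used. */
static void free_vmid_traced(int domid, int vmid)
{
    assert(vmid > INVALID_VMID && vmid < MAX_VMID);
    printf("d%d: free vmid %d (slot was %s)\n", domid, vmid,
           vmid_used[vmid] ? "set" : "ALREADY CLEAR");
    vmid_used[vmid] = false;
}
```

A free line reporting "ALREADY CLEAR", or alloc lines whose vmid keeps
climbing from one domain to the next, would point at the free path.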
Looking quickly at the log, the domain is dom1101. However, the maximum
number of VMIDs supported is 256, so the exhaustion might be a race
somewhere.
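
To spell out the arithmetic behind that remark: if a VMID leaked on every
single destroy, the pool would run dry long before domain ID 1101. The sketch
below is a back-of-the-envelope simulation, not Xen code.
cycles_before_exhaustion() is a hypothetical helper, and it assumes a
first-fit allocator with one reserved value (0) and one domain alive at a
time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_VMID     256
#define INVALID_VMID 0      /* assumption: one value is reserved */

/*
 * Count how many create/destroy cycles succeed before allocation fails.
 * If leak_on_destroy is true, the VMID is never returned to the pool.
 */
static int cycles_before_exhaustion(bool leak_on_destroy, int max_cycles)
{
    bool used[MAX_VMID];
    int cycle;

    memset(used, 0, sizeof(used));
    used[INVALID_VMID] = true;

    for ( cycle = 0; cycle < max_cycles; cycle++ )
    {
        int v = 0;

        /* First-fit search for a free VMID. */
        while ( v < MAX_VMID && used[v] )
            v++;
        if ( v == MAX_VMID )
            break;                  /* pool exhausted */

        used[v] = true;             /* domain created */
        if ( !leak_on_destroy )
            used[v] = false;        /* domain destroyed, VMID returned */
    }

    return cycle;
}
```

Under these assumptions, cycles_before_exhaustion(true, 1101) stops after 255
cycles, while cycles_before_exhaustion(false, 1101) completes all 1101. A
leak on every destroy would therefore have shown up around the 255th domain,
which does not match a failure at dom1101.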
I would be interested to get a reproducer. I wrote a script to cycle a
domain (create/destroy) in a loop, and I have not seen any issue after 1200
cycles (and counting).
Cheers,
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel