On 05/05/15 13:34, Jan Beulich wrote:
>>>> On 30.04.15 at 15:28, <david.vra...@citrix.com> wrote:
>> From: Malcolm Crossley <malcolm.cross...@citrix.com>
>>
>> Performance analysis of aggregate network throughput with many VMs
>> shows that performance is significantly limited by contention on the
>> maptrack lock when obtaining/releasing maptrack handles from the free
>> list.
>>
>> Instead of a single free list, use a per-VCPU list. This avoids any
>> contention when obtaining a handle. Handles must be released back to
>> their original list, and since this may occur on a different VCPU there
>> is some contention on the destination VCPU's free list tail pointer
>> (but this is much better than a per-domain lock).
>>
>> A domain's maptrack limit is multiplied by the number of VCPUs. This
>> ensures that a worst-case domain that only performs grant table
>> operations via one VCPU will not see a lower maptrack limit.
[...]
>> +    cur_tail = v->maptrack_tail;
>
> read_atomic()?
It's not required, since if this load gets inconsistent state the
cmpxchg loop will just go around once more. I've added the
read_atomic() anyway, though.

>> @@ -1430,6 +1456,17 @@ gnttab_setup_table(
>>      gt = d->grant_table;
>>      write_lock(&gt->lock);
>>
>> +    /* Tracking of mapped foreign frames table */
>> +    if ( (gt->maptrack = xzalloc_array(struct grant_mapping *,
>> +                                       max_maptrack_frames * d->max_vcpus)) == NULL )
>> +        goto out3;
>
> This surely can easily become an allocation of far more than a page,
> and hence needs to be broken up (perhaps using vmap() to map
> individually allocated pages).

I think there should be a common vzalloc_array() function. Do you
agree? It would use xzalloc_array() if the allocation is < PAGE_SIZE,
to avoid needlessly using vmap space.

David
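
For readers following along, here is a minimal standalone sketch of the
append-to-tail retry loop being discussed. It is hosted C using GCC's
__atomic builtins rather than Xen's cmpxchg()/read_atomic(), and the
names are placeholders rather than the patch's code; the point is only
that a stale, non-atomic initial read of the tail costs one extra trip
around the loop.

/*
 * Illustration only, not the patch: the shape of the append-to-tail
 * retry loop under discussion, in hosted C with GCC __atomic builtins
 * instead of Xen's cmpxchg().  A stale (non-atomic) initial read of the
 * tail just makes the compare-exchange fail once, refreshing the value.
 */
#include <stdint.h>
#include <stdio.h>

static uint32_t maptrack_tail;  /* stands in for the per-VCPU free-list tail */

static uint32_t append_to_tail(uint32_t handle)
{
    uint32_t prev_tail = maptrack_tail;  /* plain read; may be stale */

    /* Swing the tail to the freed handle; on failure, prev_tail is
     * refreshed with the current tail and the loop goes around again. */
    while ( !__atomic_compare_exchange_n(&maptrack_tail, &prev_tail, handle,
                                         0, __ATOMIC_ACQ_REL,
                                         __ATOMIC_ACQUIRE) )
        ;

    /* prev_tail is the old tail; the real code would now link the old
     * tail entry to the newly freed handle. */
    return prev_tail;
}

int main(void)
{
    append_to_tail(7);
    append_to_tail(42);
    printf("tail=%u\n", (unsigned int)maptrack_tail);
    return 0;
}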
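
Similarly, a hedged sketch of the size-based dispatch a common
vzalloc_array() could perform, again as a standalone hosted-C
simulation: xzalloc_bytes_sim() and vmap_pages_sim() are stand-ins, not
existing Xen interfaces, and in the hypervisor the large path would
allocate individual pages and vmap() them.

/*
 * Illustration only: the size-based dispatch a common vzalloc_array()
 * could perform.  Hosted-C simulation; the *_sim helpers are stand-ins
 * for Xen primitives.  In the hypervisor, the large path would allocate
 * individual pages and vmap() them, so no high-order allocation is
 * ever needed.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u

/* Stand-in for a zeroing pool allocator (the xmalloc pool in Xen). */
static void *xzalloc_bytes_sim(size_t bytes)
{
    return calloc(1, bytes);
}

/* Stand-in for a page-granular, vmap()-backed allocation. */
static void *vmap_pages_sim(size_t nr_pages)
{
    return calloc(nr_pages, PAGE_SIZE);
}

static void *vzalloc_array_sim(size_t size, size_t nmemb)
{
    size_t bytes;

    if ( nmemb && size > SIZE_MAX / nmemb )  /* reject overflow */
        return NULL;
    bytes = size * nmemb;

    if ( bytes < PAGE_SIZE )                 /* small: normal pool, no vmap */
        return xzalloc_bytes_sim(bytes);

    return vmap_pages_sim((bytes + PAGE_SIZE - 1) / PAGE_SIZE);
}

int main(void)
{
    void *small = vzalloc_array_sim(sizeof(void *), 8);
    void *large = vzalloc_array_sim(sizeof(void *), 1u << 16);
    printf("small=%p large=%p\n", small, large);
    free(small);
    free(large);
    return 0;
}

Keeping sub-page requests on the ordinary pool-allocator path avoids
consuming vmap address space for small allocations, which is the point
made above.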