On 07/22/2015 10:50 AM, Dario Faggioli wrote:
On Wed, 2015-07-22 at 16:09 +0200, Juergen Gross wrote:
On 07/22/2015 03:58 PM, Boris Ostrovsky wrote:
What if I configure a guest to follow the HW topology? I.e., I pin VCPUs to
the appropriate cores/threads? With the elfnote I am stuck with disabled topology.
Add an option to do exactly that: follow HW topology (pin vcpus,
configure vnuma)?
I thought about configuring things in such a way that they match the
host topology, as Boris is suggesting, too. And in that case, arranging
for that in the toolstack, if PV vNUMA is identified (as I think Juergen
is suggesting), seems a good approach.
However, when I try to do that manually on my box, I don't seem to
be able to.
Here's what I tried. Since I have this host topology:
cpu_topology :
cpu:    core    socket     node
  0:       0        1        0
  1:       0        1        0
  2:       1        1        0
  3:       1        1        0
  4:       9        1        0
  5:       9        1        0
  6:      10        1        0
  7:      10        1        0
  8:       0        0        1
  9:       0        0        1
 10:       1        0        1
 11:       1        0        1
 12:       9        0        1
 13:       9        0        1
 14:      10        0        1
 15:      10        0        1
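For reference, that is the cpu_topology section as printed by xl's NUMA
info, so it should be easy to reproduce with:
# xl info -n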
I configured the guest like this:
vcpus = '4'
memory = '1024'
vnuma = [ [ "pnode=0","size=512","vcpus=0-1","vdistances=10,20" ],
[ "pnode=1","size=512","vcpus=2-3","vdistances=20,10" ] ]
cpus=["0","1","8","9"]
This means that vcpus 0 and 1, which are assigned to vnode 0, are pinned to
pcpus 0 and 1, which are siblings per the host topology.
Similarly, vcpus 2 and 3, assigned to vnode 1, are pinned to two
sibling pcpus (8 and 9) on pnode 1.
This seems to be honoured:
# xl vcpu-list 4
Name                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
test                 4     0     0   -b-      10.9  0 / 0-7
test                 4     1     1   -b-       7.6  1 / 0-7
test                 4     2     8   -b-       0.1  8 / 8-15
test                 4     3     9   -b-       0.1  9 / 8-15
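(As an aside, the same hard affinity can also be set or changed at
runtime, without recreating the guest, e.g. something like:
# xl vcpu-pin 4 2 8
# xl vcpu-pin 4 3 9
just in case someone wants to experiment with different placements.)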
And yet, no joy: even with four CPU hogs running in the guest, only two
of the four vcpus ever get to run:
# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
# xl vcpu-list 4
Name                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
test                 4     0     0   r--      16.4  0 / 0-7
test                 4     1     1   r--      12.5  1 / 0-7
test                 4     2     8   -b-       0.2  8 / 8-15
test                 4     3     9   -b-       0.1  9 / 8-15
So, what am I doing wrong at "following the hw topology"?
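For what it's worth, this is what I would check from inside the guest, to
see what topology the guest kernel actually derived (just a quick sysfs
peek, assuming a Linux guest, plus numactl if it happens to be installed):
# cat /sys/devices/system/cpu/online
# cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
# numactl --hardware
If the siblings reported there don't match the pinning above, that would
go some way toward explaining the guest scheduler piling everything on
two vcpus.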
Besides, this is not necessarily a NUMA-only issue; it's a scheduling
one (inside the guest) as well.
Sure. That's what Jan said regarding SUSE's xen-kernel. No topology info
(or a trivial one) might be better than the wrong one...
Yep. Exactly. As Boris says, this is a generic scheduling issue, although
it's true that it's only (as far as I can tell) with vNUMA that it bites
us so hard...
I am not sure that it's only vNUMA. It's just that with vNUMA we can see
a warning (on your system) that something went wrong. In other cases
(like scheduling, or sizing objects based on discovered cache sizes) we
don't see anything in the log, but the system/programs are still making
wrong decisions. (And your results above may well be an example of that.)
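For instance, the guest-visible cache geometry is exactly that kind of
silently consumed information; a quick way to see what the guest believes
(assuming a Linux guest with glibc's getconf) is:
# getconf LEVEL1_DCACHE_LINESIZE
# cat /sys/devices/system/cpu/cpu0/cache/index*/size
If those values don't correspond to the pcpus the vcpus actually run on,
applications will size their buffers on wrong data, and nothing in any
log will say so.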
-boris
I mean, performance is always going to be inconsistent,
but it's only in that case that you basically _lose_ some of the
vcpus! :-O
Dario