Hi NUMA folks,

Yasuaki Ishimatsu found that the NUMA mapping (the cpu<->node relationship)
changes when a node is hot-added or hot-removed, and this change causes an
allocation failure in the workqueue subsystem:
...
  SLUB: Unable to allocate memory on node 2 (gfp=0x80d0)
  cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0
  node 0: slabs: 6172, objs: 259224, free: 245741
  node 1: slabs: 3261, objs: 136962, free: 127656
...

It happens in the following situation:

1) System Node/CPU before offline/online:
               | CPU
        ------------------------
        node 0 |  0-14, 60-74
        node 1 | 15-29, 75-89
        node 2 | 30-44, 90-104
        node 3 | 45-59, 105-119

2) A system board (containing node 2 and node 3) is taken offline:
               | CPU
        ------------------------
        node 0 |  0-14, 60-74
        node 1 | 15-29, 75-89

3) A new system board (also containing two nodes) is brought online. Two new
   node IDs are allocated for the board's nodes, but the old CPU IDs are
   reused for its CPUs, so the NUMA mapping between nodes and CPUs changes
   (the node of CPU#30 changes from node#2 to node#4, for example):
               | CPU
        ------------------------
        node 0 |  0-14, 60-74
        node 1 | 15-29, 75-89
        node 4 | 30-59
        node 5 | 90-119

4) The NUMA mapping has now changed, which leads to allocation failures like
   the one shown above (see the sketch below).
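To make the mechanism I suspect concrete, here is a minimal, purely
illustrative C sketch. It is not the actual workqueue code; the function and
variable names are made up. It only shows how a subsystem that caches
cpu_to_node() at init time can later pass a stale node id to kmalloc_node()
after the remap, producing a failure like the SLUB message above:

#include <linux/slab.h>
#include <linux/topology.h>

/* Hypothetical example; names and flow are illustrative only. */
static int cached_node_of_cpu30;

static void cache_mapping_at_boot(void)
{
        /* Before the hot-remove, CPU 30 belongs to node 2. */
        cached_node_of_cpu30 = cpu_to_node(30);
}

static void *alloc_after_remap(void)
{
        /*
         * After the new board is onlined, CPU 30 belongs to node 4 and
         * node 2 is gone, so this allocation can fail with
         * "SLUB: Unable to allocate memory on node 2".
         */
        return kmalloc_node(192, GFP_KERNEL, cached_node_of_cpu30);
}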

So the question is: *why does the NUMA mapping need to change?*
We already reuse the free CPU IDs for the new CPUs, so why not also reuse the
free node IDs and keep the mapping the same as before?
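As a rough illustration of the policy I have in mind (this is only a sketch,
not existing kernel code, and the names are invented), the node ID allocator
could hand back the lowest currently-unused ID instead of a brand-new one, so
a re-added board would become node 2/3 again rather than node 4/5:

#define MAX_NODES 8

static int node_in_use[MAX_NODES];      /* nonzero while the node id is taken */

/* Hand back the lowest free node id instead of a brand-new one. */
static int alloc_node_id_reuse_lowest(void)
{
        int nid;

        for (nid = 0; nid < MAX_NODES; nid++) {
                if (!node_in_use[nid]) {
                        node_in_use[nid] = 1;
                        return nid;
                }
        }
        return -1;      /* no id left */
}

static void free_node_id(int nid)
{
        node_in_use[nid] = 0;   /* id becomes reusable on the next hot-add */
}

With that behavior, step 3 above would bring the new board up as node 2 and
node 3 again, and the cpu<->node mapping would stay stable.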

Looking forward to your response, thanks.

Best regards,
Gu