Nathan Fontenot <nf...@linux.vnet.ibm.com> writes:
> On 09/19/2018 11:38 PM, Michael Ellerman wrote:
>> Nathan Fontenot <nf...@linux.vnet.ibm.com> writes:
>>> When removing memory we need to remove the memory from the node
>>> it was added to instead of looking up the node it should be in
>>> in the device tree.
>>>
>>> During testing we have seen scenarios where the affinity for a
>>> LMB changes due to a partition migration or PRRN event. In these
>>> cases the node the LMB exists in may not match the node the device
>>> tree indicates it belongs in. This can lead to a system crash
>>> when trying to DLPAR remove the LMB after a migration or PRRN
>>> event. The current code looks up the node in the device tree to
>>> remove the LMB from, the crash occurs when we try to offline this
>>> node and it does not have any data, i.e. node_data[nid] == NULL.
>>
>> This isn't building for 32-bit etc:
>>
>> arch/powerpc/mm/drmem.c: In function 'init_drmem_v1_lmbs':
>> arch/powerpc/mm/drmem.c:371:14: error: implicit declaration of function
>> 'memory_add_physaddr_to_nid' [-Werror=implicit-function-declaration]
>>    lmb->nid = memory_add_physaddr_to_nid(lmb->base_addr);
>>               ^
>> cc1: all warnings being treated as errors
>> scripts/Makefile.build:317: recipe for target 'arch/powerpc/mm/drmem.o'
>> failed
>>
>> See the failed checks here:
>> https://patchwork.ozlabs.org/patch/969150/
>>
>>
>> Probably drmem.c should only be compiled for 64-bit NUMA etc.
>
> Looks like the root cause is that memory hotplug relies on sparsemem which
> is not supported on 32-bit.

Yeah that could be it. Making drmem.c build just for MEMORY_HOTPLUG would
make sense.

> This patch is also going to need a refresh to apply cleanly due to other
> patches that have gone in. I'll re-submit after looking at the build break
> issues more.

OK thanks.

cheers