On 11/09/2018 08:45 PM, Qian Cai wrote:
>> Sent: Friday, November 09, 2018 at 5:08 PM
>> From: "Waiman Long" <long...@redhat.com>
>> To: "Qian Cai" <c...@gmx.us>, "Yang Shi" <yang....@linux.alibaba.com>
>> Cc: "open list" <linux-kernel@vger.kernel.org>, "Thomas Gleixner"
>> <t...@linutronix.de>, "Arnd Bergmann" <a...@arndb.de>, "Joel Fernandes
>> (Google)" <j...@joelfernandes.org>, "Zhong Jiang" <zhongji...@huawei.com>
>> Subject: Re: ODEBUG: Out of memory. ODEBUG disabled
>>
>> On 11/09/2018 04:51 PM, Qian Cai wrote:
>>>> On Nov 9, 2018, at 4:42 PM, Yang Shi <yang....@linux.alibaba.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/9/18 1:36 PM, Qian Cai wrote:
>>>>> It is a bit annoying on this aarch64 server with 64 CPUs that is
>>>>> booting the latest mainline (3541833fd1f2) causes object debugging
>>>>> always running out of memory.
>>>> May you please paste the detail failure log?
>>> I assume you mean dmesg.
>>>
>>> Here is the dmesg for 64 CPUs,
>>> https://paste.ubuntu.com/p/BnhvXXhn7k/
>>>>> I have to boot the kernel with only 16 CPUs instead (nr_cpus=16)
>>>>> to make it work. Is it expected that object debugging is not going
>>>>> to work with large machines?
>>>> I don't think so. I'm supposed it works well with large CPU number on x86.
>>> Here is the one with nr_cpus workaround,
>>> https://paste.ubuntu.com/p/qMpd2CCPSV/
>> The debugobjects code have a set of 1024 statically allocated debug
>> objects that can be used in early boot before the slab memory allocator
>> is initialized. Apparently, the system may have used up all the
>> statically allocated objects. Try double ODEBUG_POOL_SIZE to see if it
>> helps.
> Great, you are right. Doubling the size makes it work. Does it make sense
> to have a kconfig option instead?

First, I think you need to figure out why your system needed to use up
so many debug objects in early boot. If there is a legitimate reason for
this behavior, we can talk about having a kconfig option to increase that.

>> There are also quite a number of warnings in your console log. So there
>> is certainly something wrong with your kernel or config options.
> Yes, I am working on all those warnings. This one is found by ODEBUG,
> https://lkml.org/lkml/2018/11/10/136

Cheers,
Longman
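
For readers following the thread, the static-pool behaviour described above can be modelled in userspace roughly as follows. This is only a simplified sketch under stated assumptions, not the code in the kernel's debugobjects implementation: every name here (POOL_SIZE, struct dbg_obj, pool_init, track, allocator_ready, tracking_enabled) is hypothetical, malloc() stands in for the slab allocator, and the real code refills its pool rather than falling back one object at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model; mirrors the 1024 static objects mentioned in the thread. */
#define POOL_SIZE 1024

struct dbg_obj {
    struct dbg_obj *next;   /* free-list link */
    void *tracked;          /* address of the object being tracked */
};

static struct dbg_obj static_pool[POOL_SIZE];
static struct dbg_obj *free_list;
static size_t pool_avail;
static bool allocator_ready;    /* stands in for "slab is initialized" */
static bool tracking_enabled;

/* Put every statically allocated object on the free list. */
static void pool_init(void)
{
    free_list = NULL;
    pool_avail = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        static_pool[i].next = free_list;
        free_list = &static_pool[i];
        pool_avail++;
    }
    allocator_ready = false;
    tracking_enabled = true;
}

/*
 * Take a tracking object for addr. Before the allocator is ready only the
 * static pool can be used; once it is empty, tracking is switched off,
 * which is the analogue of "ODEBUG: Out of memory. ODEBUG disabled".
 */
static struct dbg_obj *track(void *addr)
{
    struct dbg_obj *obj = NULL;

    if (!tracking_enabled)
        return NULL;

    if (free_list) {
        obj = free_list;
        free_list = obj->next;
        pool_avail--;
    } else if (allocator_ready) {
        obj = malloc(sizeof(*obj));   /* assumption: heap fallback */
    }

    if (!obj) {
        tracking_enabled = false;     /* out of objects: give up for good */
        return NULL;
    }

    obj->next = NULL;
    obj->tracked = addr;
    return obj;
}
```

In this model, tracking POOL_SIZE + 1 objects before allocator_ready is set disables tracking on the last request, while doubling POOL_SIZE (as suggested for ODEBUG_POOL_SIZE) lets the same early workload through. It says nothing about why the 64-CPU boot consumes so many objects, which is the open question above.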