On Wed, Sep 18, 2019 at 7:23 PM Michael Ellerman <m...@ellerman.id.au> wrote: > > Pingfan Liu <kernelf...@gmail.com> writes: > > Cc Kexec list. And keep the original content. > > > > On Thu, Sep 12, 2019 at 10:50 AM Pingfan Liu <kernelf...@gmail.com> wrote: > >> > >> 'mem=" option is an easy way to put high pressure on memory during some > >> test. Hence in stead of total mem, the effective usable memory size > ^ ^ > instead "actual" would be clearer > > I think adding: "after applying the memory limit" > > would help here. > > >> should be considered when reserving mem for crashkernel. Otherwise > >> the boot up may experience oom issue. > ^ > OOM > >> > >> E.g passing > >> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and > >> mem=5G on a 256G machine. > > Spelling out the behaviour before and after would help here, eg: > > .. "would reserve 4G prior to the change and 512M afterward." > Thanks for kindly review. I will update the commit based on your suggestion. > > >> Signed-off-by: Pingfan Liu <kernelf...@gmail.com> > >> Cc: Hari Bathini <hbath...@linux.ibm.com> > >> Cc: Michael Ellerman <m...@ellerman.id.au> > >> To: linuxppc-dev@lists.ozlabs.org > >> --- > >> v1 -> v2: fix the printk info about the total mem > >> arch/powerpc/kernel/machine_kexec.c | 7 ++++--- > >> 1 file changed, 4 insertions(+), 3 deletions(-) > >> > >> diff --git a/arch/powerpc/kernel/machine_kexec.c > >> b/arch/powerpc/kernel/machine_kexec.c > >> index c4ed328..eec96dc 100644 > >> --- a/arch/powerpc/kernel/machine_kexec.c > >> +++ b/arch/powerpc/kernel/machine_kexec.c > >> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image) > >> > >> void __init reserve_crashkernel(void) > >> { > >> - unsigned long long crash_size, crash_base; > >> + unsigned long long crash_size, crash_base, total_mem_sz; > >> int ret; > >> > >> + total_mem_sz = memory_limit ? memory_limit : > >> memblock_phys_mem_size(); > >> /* use common parsing */ > >> - ret = parse_crashkernel(boot_command_line, > >> memblock_phys_mem_size(), > >> + ret = parse_crashkernel(boot_command_line, total_mem_sz, > >> &crash_size, &crash_base); > > I think this change makes sense. But we have multiple arches that > implement similar logic, and I wonder if we should keep them all the > same. > > eg: > > arch/arm/kernel/setup.c: ret = > parse_crashkernel(boot_command_line, total_mem, > arch/arm64/mm/init.c: ret = > parse_crashkernel(boot_command_line, memblock_phys_mem_size(), > arch/ia64/kernel/setup.c: ret = > parse_crashkernel(boot_command_line, total, > arch/mips/kernel/setup.c: ret = > parse_crashkernel(boot_command_line, total_mem, > arch/powerpc/kernel/fadump.c: ret = > parse_crashkernel(boot_command_line, memblock_phys_mem_size(), > arch/powerpc/kernel/machine_kexec.c: ret = > parse_crashkernel(boot_command_line, memblock_phys_mem_size(), > arch/s390/kernel/setup.c: rc = > parse_crashkernel(boot_command_line, memory_end, &crash_size, > arch/sh/kernel/machine_kexec.c: ret = > parse_crashkernel(boot_command_line, memblock_phys_mem_size(), > arch/x86/kernel/setup.c: ret = > parse_crashkernel(boot_command_line, total_mem, &crash_size, &crash_base); > > > From a quick glance most of them don't seem to take the memory limit > into account. > > So I guess the question is do we want all arches to implement the same > behaviour or do we think it doesn't matter if they differ in details > like this?
On powerpc, the current code make fadump/kdump a higher priority than "mem=" option, as the notes in fadump_reserve_mem() says " /* * Calculate the memory boundary. * If memory_limit is less than actual memory boundary then reserve * the memory for fadump beyond the memory_limit and adjust the * memory_limit accordingly, so that the running kernel can run with * specified memory_limit. */ " While on other archs, they pack "mem=" info into memblock before calling memblock_phys_mem_size(). So when parse_crashkernel() calls memblock_phys_mem_size(), the "mem=" takes effect. E.g for x86 in arch/x86/kernel/e820.c static int __init parse_memopt(char *p) { ... e820__range_remove(mem_size, ULLONG_MAX - mem_size, E820_TYPE_RAM, 1); // this pack the "mem=" info into e820, and is finally feed to memblock } Thanks, Pingfan