On 03/24/2017 at 01:46 AM, Michael Holzheu wrote:
> On Thu, 23 Mar 2017 17:23:53 +0800,
> Xunlei Pang <xp...@redhat.com> wrote:
>
>> On 03/23/2017 at 04:48 AM, Michael Holzheu wrote:
>>> On Wed, 22 Mar 2017 12:30:04 +0800,
>>> Dave Young <dyo...@redhat.com> wrote:
>>>
>>>> On 03/21/17 at 10:18pm, Eric W. Biederman wrote:
>>>>> Dave Young <dyo...@redhat.com> writes:
>>>>>
>>> [snip]
>>>
>>>>>> I think makedumpfile is using it, but I also vote to remove the
>>>>>> CRASHTIME. It is better not to do this while crashing and a makedumpfile
>>>>>> userspace patch is needed to drop the use of it.
>>>>>>
>>>>>>> As we are looking at reliability concerns removing CRASHTIME should make
>>>>>>> everything in vmcoreinfo a boot time constant. Which should simplify
>>>>>>> everything considerably.
>>>>>> It is a nice improvement..
>>>>> We also need to take a close look at what s390 is doing with vmcoreinfo.
>>>>> As apparently it is reading it in a different kind of crashdump process.
>>>> Yes, need careful review from s390 and maybe ppc64 especially about
>>>> patch 2/3, better to have comments from IBM about s390 dump tool and ppc
>>>> fadump. Added more cc.
>>> On s390 we have at least an issue with patch 1/3. For stand-alone dump
>>> and also because we create the ELF header for kdump in the new
>>> kernel we save the pointer to the vmcoreinfo note in the old kernel on a
>>> defined memory address in our absolute zero lowcore.
>>>
>>> This is done in arch/s390/kernel/setup.c:
>>>
>>> static void __init setup_vmcoreinfo(void)
>>> {
>>>         mem_assign_absolute(S390_lowcore.vmcore_info,
>>>                             paddr_vmcoreinfo_note());
>>> }
>>>
>>> Since with patch 1/3 paddr_vmcoreinfo_note() returns NULL at this point in
>>> time we have a problem here.
>>>
>>> To solve this - I think - we could move the initialization to
>>> arch/s390/kernel/machine_kexec.c:
>>>
>>> void arch_crash_save_vmcoreinfo(void)
>>> {
>>>         VMCOREINFO_SYMBOL(lowcore_ptr);
>>>         VMCOREINFO_SYMBOL(high_memory);
>>>         VMCOREINFO_LENGTH(lowcore_ptr, NR_CPUS);
>>>         mem_assign_absolute(S390_lowcore.vmcore_info,
>>>                             paddr_vmcoreinfo_note());
>>> }
>>>
>>> Probably related to this is my observation that patch 3/3 leads to
>>> an empty VMCOREINFO note for kdump on s390. The note is there ...
>>>
>>> # readelf -n /var/crash/127.0.0.1-2017-03-22-21:14:39/vmcore | grep VMCORE
>>> VMCOREINFO 0x0000068e Unknown note type: (0x00000000)
>>>
>>> But it contains only zeros.
>> Yes, this is a good catch, I will do more tests.
> Hello Xunlei,
>
> After spending some time on this, I now understand the problem:
>
> In patch 3/3 you copy vmcoreinfo into the control page before
> machine_kexec_prepare() is called. For s390 we give back all the
> crashkernel memory to the hypervisor before the new crashkernel
> is loaded:
>
> /*
>  * Give back memory to hypervisor before new kdump is loaded
>  */
> static int machine_kexec_prepare_kdump(void)
> {
> #ifdef CONFIG_CRASH_DUMP
>         if (MACHINE_IS_VM)
>                 diag10_range(PFN_DOWN(crashk_res.start),
>                              PFN_DOWN(crashk_res.end - crashk_res.start + 1));
>         return 0;
> #else
>         return -EINVAL;
> #endif
> }
>
> So after machine_kexec_prepare_kdump() the contents of your control page
> are gone and therefore the vmcoreinfo ELF note contains only zeros.
>
> If you call kimage_crash_copy_vmcoreinfo() after
> machine_kexec_prepare_kdump() the problem should be solved for s390.
Will update, thanks for finding the root cause.

Regards,
Xunlei
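
For reference, the ordering problem discussed in this thread can be illustrated with a small user-space model. This is only a sketch, not kernel code. The names crashk_mem, release_crashk_to_hypervisor() and copy_vmcoreinfo_to_control_page() are hypothetical stand-ins for the crashkernel region, the diag10_range() call in machine_kexec_prepare_kdump(), and kimage_crash_copy_vmcoreinfo(). It also assumes that released pages read back as zeros, which matches the all-zero note reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CRASHK_SIZE 4096 /* hypothetical size of the reserved region */
#define NOTE_SIZE   64   /* hypothetical size of the vmcoreinfo note */

/* Stand-in for the crashkernel memory; the control page lives at offset 0. */
unsigned char crashk_mem[CRASHK_SIZE];

/* Stand-in for the vmcoreinfo note prepared by the running kernel. */
const char vmcoreinfo_note[NOTE_SIZE] = "OSRELEASE=example\n";

/* Models diag10_range(): assumed here to leave the pages zero-filled. */
void release_crashk_to_hypervisor(void)
{
    memset(crashk_mem, 0, sizeof(crashk_mem));
}

/* Models kimage_crash_copy_vmcoreinfo(): copy the note into the control page. */
void copy_vmcoreinfo_to_control_page(void)
{
    assert(NOTE_SIZE <= CRASHK_SIZE);
    memcpy(crashk_mem, vmcoreinfo_note, NOTE_SIZE);
}

/* Returns true if the note area in the control page holds only zeros. */
bool control_page_is_zero(void)
{
    for (size_t i = 0; i < NOTE_SIZE; i++)
        if (crashk_mem[i] != 0)
            return false;
    return true;
}

/* Ordering as in patch 3/3: copy first, then give the memory back. */
bool load_copy_then_prepare(void)
{
    copy_vmcoreinfo_to_control_page();
    release_crashk_to_hypervisor();
    return control_page_is_zero();
}

/* Suggested ordering: give the memory back first, then copy. */
bool load_prepare_then_copy(void)
{
    release_crashk_to_hypervisor();
    copy_vmcoreinfo_to_control_page();
    return control_page_is_zero();
}
```

In this model load_copy_then_prepare() ends with an all-zero control page, while load_prepare_then_copy() keeps the note intact, which is the behaviour the suggested reordering is meant to achieve on s390.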