Hello Kowshik, thanks for the verification, even though the result is not
what we expected :-/

I double-checked that the patch is properly included:
~/ubuntu-noble-master-next/noble-clean$ git log --oneline --grep "powerpc/64s/radix/kfence: map __kfence_pool at page granularity"
ec65624fc069 powerpc/64s/radix/kfence: map __kfence_pool at page granularity
~/ubuntu-noble-master-next/noble-clean$ git tag --contains ec65624fc069
Ubuntu-6.8.0-48.48
And it is.

We also currently have CONFIG_KFENCE set to 'y' for all architectures,
including ppc64el:
grep -ri CONFIG_KFENCE\  debian.master/*
debian.master/config/annotations:CONFIG_KFENCE                                  policy<{'amd64': 'y', 'arm64': 'y', 'armhf': 'y', 'ppc64el': 'y', 'riscv64': 'y', 's390x': 'y'}>
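
To cross-check the same setting on the test system itself, here is a minimal
sketch. It assumes the usual Ubuntu location /boot/config-$(uname -r) for the
config of the running kernel; the helper name kfence_config_value is made up
for illustration and is not part of any tool:

```shell
#!/bin/sh
# Hypothetical helper: print the value of CONFIG_KFENCE from a kernel config
# file. Prints "unset" when the option is absent or commented out.
kfence_config_value() {
    config_file="$1"
    # Match only CONFIG_KFENCE itself, not CONFIG_KFENCE_SAMPLE_INTERVAL etc.
    value=$(sed -n 's/^CONFIG_KFENCE=//p' "$config_file")
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'unset\n'
    fi
}

# Assumption: the config of the running kernel is installed at this path.
config="/boot/config-$(uname -r)"
if [ -r "$config" ]; then
    kfence_config_value "$config"
fi
```

On a kernel built from the annotations above, this should report the same
value as the 'ppc64el' policy entry.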

Oh dear, so according to the Ubuntu SRU terms this is, first of all, a
"verification-failed".


Either the proposed fix (cherry-picking "powerpc/64s/radix/kfence: map
__kfence_pool at page granularity") does not fix the situation as expected,
or more is needed (for example, setting CONFIG_KFENCE to 'n' for ppc64el
only)?!

Is a new root cause analysis needed for this issue?
And should we pull the commit back out of the Ubuntu kernel (which is tricky,
since we are already late in the SRU cycle)?
@IBM, what do you think?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2060039

Title:
  [Ubuntu-24.04] FADump with recommended crash size is making the L1
  hang

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Noble:
  Fix Committed
Status in linux source package in Oracular:
  Fix Released

Bug description:
  SRU Justification:

  [Impact]
   * L1 host hangs when triggering FADump that results in crash

  [Fix]
   * 353d7a84c214f184d5a6b62acdec8b4424159b7c 353d7a84c214 "powerpc/64s/radix/kfence: map __kfence_pool at page granularity"

  [Test Case]
   * Have an Ubuntu Server 24.04 LTS installation on ppc64el.
   * Enable FADump with 1GB: fadump=on crashkernel=1024M
   * A kernel panic happens when the dump is triggered.
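
  Before triggering the dump, it can help to confirm that both boot
  parameters from the test case are really active. A minimal sketch follows;
  the helper name cmdline_has is made up for illustration, and the check only
  looks at the kernel command line, not at the actual memory reservation:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line contains
# the given parameter as an exact, whitespace-separated word.
cmdline_has() {
    cmdline="$1"
    wanted="$2"
    for param in $cmdline; do
        [ "$param" = "$wanted" ] && return 0
    done
    return 1
}

# Check the command line of the running kernel, if it is readable.
if [ -r /proc/cmdline ]; then
    current=$(cat /proc/cmdline)
    for p in fadump=on crashkernel=1024M; do
        if cmdline_has "$current" "$p"; then
            echo "ok: $p"
        else
            echo "missing: $p"
        fi
    done
fi
```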

  [Regression Potential]
  * There is a certain risk of regression, but the change only maps the
    memory allocated for the KFENCE pool at page granularity, which reduces
    memory consumption when KFENCE is used.

  * On top of that, the commit has already been reviewed and accepted
    upstream.

  * The modifications were done and tested by IBM.

  * The fadump feature is supported only on IBM POWER systems.

  [Other]
  * The fix/commit was accepted upstream in kernel v6.11-rc4,
    hence Oracular (with a planned kernel of 6.11) is not affected.

  .......................

  Problem description :
  ======================

  Triggered FADump with the recommended crash size. The L1 host hung.

  According to the public document
  https://wiki.ubuntu.com/ppc64el/Recommendations the recommended crash
  kernel size for the system is 1024M. But with 1024M and 2048M the L1
  hangs. With 4096M, the crash dump is generated and collected.

  root@ubuntu2404:~# uname -ar
  Linux ubuntu2404 6.8.0-11-generic #11-Ubuntu SMP Wed Feb 14 00:33:03 UTC 2024 ppc64le ppc64le ppc64le GNU/Linux

  root@ubuntu2404:~# free -h
                 total        used        free      shared  buff/cache   available
  Mem:            48Gi       1.7Gi        46Gi        13Mi       687Mi        46Gi
  Swap:          8.0Gi          0B       8.0Gi

  root@ubuntu2404:~# cat /proc/cmdline
  BOOT_IMAGE=/vmlinux-6.8.0-11-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro fadump=on crashkernel=1024M

  root@ubuntu2404:~# dmesg | grep -i reser
  [    0.000000] fadump: Reserved 1024MB of memory at 0x00000040000000 (System RAM: 51200MB)
  [    0.000000] fadump: Initialized 0x40000000 bytes cma area at 1024MB from 0x40070000 bytes of memory reserved for firmware-assisted dump
  [    0.000000] Memory: 49316672K/52428800K available (23616K kernel code, 4096K rwdata, 25536K rodata, 8832K init, 2487K bss, 2063552K reserved, 1048576K cma-reserved)
  [    0.396408] ibmvscsi 30000066: Client reserve enabled

  root@ubuntu2404:~# kdump-config show
  DUMP_MODE:            fadump
  USE_KDUMP:            1
  KDUMP_COREDIR:                /var/crash
     /var/lib/kdump/vmlinuz
  kdump initrd:
     /var/lib/kdump/initrd.img
  current state:    ready to fadump

  IBM is looking to update the crash kernel reservations section of the
  wiki for Power.


