I am seeing this with linux-image-6.8.0-40-generic version
6.8.0-40.40~22.04.3 on a Dell PowerEdge R750xa with 384G of ECC RAM
running BTRFS for all storage.

It crashes like this every day or two with that kernel, but on
linux-image-6.5.0-28-generic version 6.5.0-28.29~22.04.1 it is solid and
has never crashed. I think it's the same issue as this:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2056706

This bug might also be the same issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2080039

Title:
  Kernel BUG: Bad page state in process kswapd0

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Noble:
  Fix Released

Bug description:
  Since installing 24.04 two months ago, I've experienced a few random
  full-system freezes that required a hard-reset to recover. Up until
  now, I was not able to find the cause - plugging in a monitor to the
  system would just display nothing, and the journal logs would just
  stop abruptly.

  My first instinct was bad memory, so after it happened last week I ran
  memtest for several hours, but it did not find any memory errors.

  However I now believe I have found the actual cause, because it just
  happened again and luckily this time the journal saved the start of a
  kernel BUG message:

  BUG: Bad page state in process kswapd0  pfn:3f053e
  page:000000000f35bcf8 refcount:0 mapcount:0 mapping:000000000e24c844 index:0x2bcbd pfn:0x3f053e
  aops:btree_aops [btrfs] ino:1
  flags: 0x17ffffc0000008(uptodate|node=0|zone=2|lastcpupid=0x1fffff)
  page_type: 0xffffffff()
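
  For anyone who wants to check their own logs for the same signature,
  here is a minimal sketch. It assumes the kernel log is available as
  text on stdin; the function name find_bad_page_state and the five
  lines of context are illustrative choices, not part of any tool.

```shell
#!/bin/sh
# Print every "Bad page state" report together with the lines after it.
# Reads kernel log text on stdin. The 5 lines of trailing context are an
# arbitrary choice that happens to cover the excerpt quoted above.
find_bad_page_state() {
    grep -A 5 'BUG: Bad page state'
}

# Possible use, assuming systemd-journald keeps logs across reboots:
#   journalctl -k -b -1 | find_bad_page_state
# (-k selects kernel messages, -b -1 selects the previous boot)
```

  If the journal stopped abruptly, as it did in the earlier freezes, the
  report may simply not be there, so an empty result does not rule the
  bug out.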

  After some digging, I found this kernel bug report:

  https://lore.kernel.org/lkml/CABXGCsPktcHQOvKTbPaTwegMExije=Gpgci5NW=hqoro-s7...@mail.gmail.com/

  that appears to describe the exact same bug (I am also using btrfs as
  the root partition, and my swap file is also on that btrfs
  filesystem).

  Then I also found this kernel patch:

  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=f3a5367c679d31473d3fbb391675055b4792c309

  that appears to be a fix for the above bug.

  To try to check if this fix is present in my kernel (no idea if this
  is valid), I installed the linux-source package, extracted the archive
  in /usr/src/linux-source-6.8.0, and checked the file modified by the
  patch mentioned above - and the changes do not appear to have been
  made.
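
  The manual check described above could be sketched roughly as follows.
  This is only an illustration: the function name has_added_line is made
  up, and both arguments are placeholders that have to be taken from the
  patch's diff (the file it modifies and one distinctive line it adds).

```shell
#!/bin/sh
# Report whether a source file contains a line that the fix adds.
# Both arguments are placeholders to be filled in from the patch's diff:
#   $1 = path to the modified file inside the extracted source tree
#   $2 = a distinctive line (or fragment) that the patch adds
# -F treats the fragment as a fixed string, -q suppresses normal output.
has_added_line() {
    if grep -qF -- "$2" "$1"; then
        echo "present"
    else
        echo "missing"
    fi
}
```

  A "missing" result is only a hint. A distribution backport can differ
  textually from the upstream commit, so the package changelog is a
  useful second check.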

  So if the patch has not been applied, could this please be done? If it
  has actually been applied, then this is some other bug and I need to
  do more investigation...

  For the time being I have disabled swap in the hope of avoiding the
  crash.
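
  A small sketch for confirming that no swap is active, assuming the
  usual /proc/swaps layout of one header line followed by one line per
  swap area. The function name active_swap_count is illustrative.

```shell
#!/bin/sh
# Count active swap areas from /proc/swaps-style text on stdin.
# The first line is the column header, every later line is one swap area.
# "n + 0" forces a numeric 0 when there are no swap areas at all.
active_swap_count() {
    awk 'NR > 1 { n++ } END { print n + 0 }'
}

# Possible use:
#   swapoff -a                      # as root: disable all active swap
#   active_swap_count < /proc/swaps
```

  Note that swapoff -a only lasts until the next boot if the swap file
  is still listed in /etc/fstab.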

  # uname -a
  Linux server 6.8.0-41-generic #41-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug  2 20:41:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

  # cat /proc/version_signature
  Ubuntu 6.8.0-41.41-generic 6.8.12

  # lsb_release -rd
  No LSB modules are available.
  Description:    Ubuntu 24.04.1 LTS
  Release:        24.04
