Private bug reported:

I've been trying to run the server certification tools on a server with DCPMM devices that are configured as: 15% MemoryMode, 85% AppDirect mode.
I've configured fsdax, devdax, sector and raw devices on the DCPMMs. fsdax, devdax and sector are all formatted, mountable, etc.

The test runs CKing's stress-ng disk tests against the DCPMM storage devices, and after some time the entire server abends and resets. This is the exact same test we run on all servers, and it never causes this sort of behaviour.

So the likely issues:
1: Some oddity in testing the AppDirect storage devices on DCPMMs.
2: Some kernel issue that's not able to deal with high I/O loads on DCPMMs.

I'm currently in the middle of making this a bit easier to recreate, so I'll provide directions soon.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-54-generic 4.15.0-54.58
ProcVersionSignature: User Name 4.15.0-54.58-generic 4.15.18
Uname: Linux 4.15.0-54-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jul 2 04:41 seq
 crw-rw---- 1 root audio 116, 33 Jul 2 04:41 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Tue Jul 2 05:37:17 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 0b1f:03e9 Insyde Software Corp.
 Bus 001 Device 002: ID 0000:0001
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Intel Corporation S2600WFD
PciMultimedia:
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-54-generic root=UUID=d8f7444e-3965-49ba-bc42-628bc368893a ro
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-54-generic N/A
 linux-backports-modules-4.15.0-54-generic N/A
 linux-firmware 1.173.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/27/2019
dmi.bios.vendor: Intel Corporation
dmi.bios.version: SE5C620.86B.0D.01.0395.022720191340
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: S2600WFD
dmi.board.vendor: Intel Corporation
dmi.board.version: J46732-610
dmi.chassis.asset.tag: ....................
dmi.chassis.type: 23
dmi.chassis.vendor: ...............................
dmi.chassis.version: ..................
dmi.modalias: dmi:bvnIntelCorporation:bvrSE5C620.86B.0D.01.0395.022720191340:bd02/27/2019:svnIntelCorporation:pnS2600WFD:pvr....................:rvnIntelCorporation:rnS2600WFD:rvrJ46732-610:cvn...............................:ct23:cvr..................:
dmi.product.family: Family
dmi.product.name: S2600WFD
dmi.product.version: ....................
dmi.sys.vendor: Intel Corporation

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug bionic uec-images

** Information type changed from Public to Public Security

** Information type changed from Public Security to Private

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1834990

Title:
  Cascade Lake system with DCPMM devices abends under stress

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1834990/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp