Private bug reported:

I've been trying to run the server certification tools on a server with
DCPMM devices that are configured as 15% MemoryMode and 85% AppDirect
mode.

I've configured fsdax, devdax, sector and raw devices on the DCPMMs.
The fsdax, devdax and sector devices are all mountable, formatted, etc.
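
For reference, a minimal sketch of how namespaces in these four modes
could be created with ndctl. The region name and size below are
placeholders, not values from this system, and this is not necessarily
how the devices here were set up. The script only builds and prints the
commands; it does not run them.

```python
# Namespace modes named in the report.
MODES = ("fsdax", "devdax", "sector", "raw")


def create_namespace_cmd(mode, region, size):
    """Build (but do not run) an ndctl command for one namespace."""
    if mode not in MODES:
        raise ValueError(f"unknown namespace mode: {mode}")
    return [
        "ndctl", "create-namespace",
        f"--mode={mode}",
        f"--region={region}",
        f"--size={size}",
    ]


def plan(region="region0", size="16G"):
    """One command per mode. region and size are assumed placeholders."""
    return [create_namespace_cmd(mode, region, size) for mode in MODES]


if __name__ == "__main__":
    for cmd in plan():
        print(" ".join(cmd))
```

The resulting namespaces can be listed with `ndctl list -N` to confirm
the mode of each one.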

The test runs CKing's stress-ng disk tests against the DCPMM storage
devices, and after some time the entire server abends and resets.
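
Until exact reproduction steps are available, the following is a rough
sketch of a stress-ng disk run against a directory on one of the mounted
devices. The mount point, worker count and timeout are assumptions, and
`--hdd` is only one of stress-ng's disk stressors; the stressor list the
certification suite actually uses is not given here. The script only
builds and prints the command.

```python
def stress_ng_cmd(target_dir, workers=4, timeout_s=300):
    """Build a stress-ng command that does its file I/O under target_dir."""
    if workers < 1 or timeout_s < 1:
        raise ValueError("workers and timeout_s must be positive")
    return [
        "stress-ng",
        "--hdd", str(workers),         # number of hdd (file write/read) workers
        "--temp-path", target_dir,     # where stress-ng creates its temp files
        "--timeout", f"{timeout_s}s",  # stop after this many seconds
        "--metrics-brief",             # print a short throughput summary
    ]


if __name__ == "__main__":
    # /mnt/pmem0 is a hypothetical mount point for the fsdax device.
    print(" ".join(stress_ng_cmd("/mnt/pmem0")))
```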

This is the exact same test we run on all servers, and it never causes
this sort of behaviour. So the likely causes are:

1: Some oddity in testing the AppDirect storage devices on DCPMMs.
2: Something in the kernel that cannot cope with high I/O loads on DCPMMs.


I'm currently working on an easier way to reproduce this, so I'll provide
directions soon.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-54-generic 4.15.0-54.58
ProcVersionSignature: Ubuntu 4.15.0-54.58-generic 4.15.18
Uname: Linux 4.15.0-54-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116,  1 Jul  2 04:41 seq
 crw-rw---- 1 root audio 116, 33 Jul  2 04:41 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Tue Jul  2 05:37:17 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 0b1f:03e9 Insyde Software Corp. 
 Bus 001 Device 002: ID 0000:0001  
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Intel Corporation S2600WFD
PciMultimedia:
 
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-54-generic root=UUID=d8f7444e-3965-49ba-bc42-628bc368893a ro
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-54-generic N/A
 linux-backports-modules-4.15.0-54-generic  N/A
 linux-firmware                             1.173.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/27/2019
dmi.bios.vendor: Intel Corporation
dmi.bios.version: SE5C620.86B.0D.01.0395.022720191340
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: S2600WFD
dmi.board.vendor: Intel Corporation
dmi.board.version: J46732-610
dmi.chassis.asset.tag: ....................
dmi.chassis.type: 23
dmi.chassis.vendor: ...............................
dmi.chassis.version: ..................
dmi.modalias: dmi:bvnIntelCorporation:bvrSE5C620.86B.0D.01.0395.022720191340:bd02/27/2019:svnIntelCorporation:pnS2600WFD:pvr....................:rvnIntelCorporation:rnS2600WFD:rvrJ46732-610:cvn...............................:ct23:cvr..................:
dmi.product.family: Family
dmi.product.name: S2600WFD
dmi.product.version: ....................
dmi.sys.vendor: Intel Corporation

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug bionic uec-images

** Information type changed from Public to Public Security

** Information type changed from Public Security to Private

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1834990

Title:
  Cascade Lake system with DCPMM devices abends under stress

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1834990/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
