TL;DR @seth-arnold, as a test can you try to set the following options?

  $ echo $((32 * 1024 * 1024)) | sudo tee /proc/sys/vm/dirty_bytes
  $ echo $((32 * 1024 * 1024)) | sudo tee /proc/sys/vm/dirty_background_bytes

Repeat the test and see if the system is still unresponsive.

Details below.

This is what I think is happening in this last scenario, where
interactive performance is killed while a large I/O writer is running.

The large I/O writer generates a lot of dirty pages, and nothing forces
those pages to be synced to the backing store until the dirty_ratio
(=20%) / dirty_background_ratio (=10%) thresholds are hit. With the
default settings, these thresholds can be quite high on systems with a
lot of RAM.

For example, on a system with 16GB of free/reclaimable memory, the
amount of dirty memory that is allowed before a writer is actively
forced to flush those pages to the backing store is 16GB * 20 / 100 =
3.2GB. Flusher threads are started when the amount of dirty memory
reaches 16GB * 10 / 100 = 1.6GB.
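
The same arithmetic can be sketched in shell, to compare the
ratio-based thresholds with the 32MB value proposed for the test. This
is only an illustration: dirty_threshold_mib is a made-up helper name,
the 16GB figure is the example value from above, and integer division
rounds the results down.

```shell
#!/bin/sh
# Compute a percentage-based dirty threshold in MiB.
# $1 = free/reclaimable memory in MiB, $2 = ratio in percent.
# (dirty_threshold_mib is a hypothetical helper, not a kernel interface.)
dirty_threshold_mib() {
    echo $(( $1 * $2 / 100 ))
}

avail_mib=$((16 * 1024))   # 16GB, the example figure used above

echo "dirty_ratio=20            -> $(dirty_threshold_mib "$avail_mib" 20) MiB"
echo "dirty_background_ratio=10 -> $(dirty_threshold_mib "$avail_mib" 10) MiB"
echo "proposed *_bytes value    -> $((32 * 1024 * 1024 / 1024 / 1024)) MiB"
```

With these inputs the ratio-based limits come out at 3276 MiB and 1638
MiB, roughly a hundred times and fifty times the 32 MiB used in the
test.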

So, if the writer doesn't stop, it will consume all the free pages in
the system and at that point we are going to have a lot of dirty pages.
Then the kernel needs to decide what to do to free up some pages.

Reclaimable memory is the first choice: cached clean pages that already
have a copy on the corresponding backing store are easy to reclaim,
because they just need to be dropped from the page cache (no I/O
involved). Dirty pages are more expensive to reclaim, because they need
to be flushed to the backing store before the page can be freed. The
same applies to anonymous memory, which needs to be written to the swap
device before the page can be reused.

So when the system starts to reclaim pages, we see some swap activity
and also some I/O due to the flushing of dirty pages. I think the system
becomes sluggish because there are too many dirty pages: the kernel
spends too much time selecting the right pages to reclaim, and
interactive performance is killed.

This looks like a bug/regression in the kernel and I think we should
definitely investigate more and track down the cause of the problem. In
the meantime, as a test to prove this theory, I think we could try to
reduce the amount of dirty pages allowed in the system by tuning the
dirty thresholds vm.dirty_bytes and vm.dirty_background_bytes (using the
*_bytes tunables gives more fine-grained control over those thresholds)
and see if there are some benefits in the specific scenario reported by
Seth.
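
Before changing anything it may help to record the current values, so
they can be put back after the test. A small sketch, assuming the four
knobs are exposed under /proc/sys/vm; show_dirty_settings is a
hypothetical helper and its directory argument is only there for
testing:

```shell
#!/bin/sh
# Print the current writeback thresholds, one "name = value" per line.
# $1 = directory holding the knobs (defaults to /proc/sys/vm).
show_dirty_settings() {
    vm_dir=${1:-/proc/sys/vm}
    for knob in dirty_ratio dirty_background_ratio \
                dirty_bytes dirty_background_bytes; do
        if [ -r "$vm_dir/$knob" ]; then
            printf '%s = %s\n' "$knob" "$(cat "$vm_dir/$knob")"
        fi
    done
}

show_dirty_settings
```

Note that each *_bytes knob and its *_ratio counterpart are mutually
exclusive: after writing one, the other reads back as 0. To go back to
the ratio-based defaults mentioned above, something like
"sudo sysctl -w vm.dirty_ratio=20 vm.dirty_background_ratio=10" should
be enough.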

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1861359

Title:
  swap storms kills interactive use

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Focal:
  Confirmed

Bug description:
  [Impact]

  High watermark boosting can cause large swap activity under certain
  memory intensive workloads, making the system very unresponsive
  (screen does not refresh, keyboard not responding, etc.).

  This large swap activity seems to be prevented by disabling high
  watermark boosting.

  [Test case]

  Opening this web page in chrome seems to be a good reproducer of the
  problem:

  https://platform.leolabs.space/visualizations/conjunction?type=conjunction&reportId=2004981040

  When this page is opened we can clearly see from 'top' (for example)
  that the used swap is going up very quickly.

  With the fix applied swap is not used at all and the system is always
  responsive.

  [Fix]

  Set vm.watermark_boost_factor to 0, disabling watermark boosting by
  default.

  [Regression potential]

  Regression potential is minimal, setting vm.watermark_boost_factor to
  0 by default restores the old kernel behavior before watermark
  boosting was introduced. In case of unexpected regressions we can
  always fix this in user-space via sysctl.
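
  For reference, the setting can be inspected and changed from
  user-space. A minimal sketch, assuming the knob is exposed as
  /proc/sys/vm/watermark_boost_factor; boost_status is a made-up helper
  name and its file argument is only there for testing:

```shell
#!/bin/sh
# Report the current watermark boost factor (0 means boosting is off).
# $1 = file to read (defaults to /proc/sys/vm/watermark_boost_factor).
boost_status() {
    boost_file=${1:-/proc/sys/vm/watermark_boost_factor}
    if [ -r "$boost_file" ]; then
        echo "vm.watermark_boost_factor = $(cat "$boost_file")"
    else
        echo "vm.watermark_boost_factor is not exposed by this kernel"
    fi
}

boost_status

# To disable boosting at runtime (requires root), one would run:
#   sudo sysctl -w vm.watermark_boost_factor=0
```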

  [Original report]

  Hello, several times since upgrading to focal from 19.04 I've found my
  computer entirely unresponsive for periods of twenty or thirty
  seconds. No mouse movement, no keyboard input, the screen output does
  not change.

  My computer was using swap space and despite very slow writeout speeds
  well below what the NVME drive can handle, the computer was unusable.

  I've captured some vmstat 1 output and top output that I started
  collecting during the event. (Normally one very long painful period is
  followed by several shorter periods of uselessness.)

  Thanks

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.4.0-12-generic 5.4.0-12.15
  ProcVersionSignature: Ubuntu 5.4.0-12.15-generic 5.4.8
  Uname: Linux 5.4.0-12-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu15
  Architecture: amd64
  Date: Wed Jan 29 23:44:05 2020
  ProcEnviron:
   TERM=rxvt-unicode-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-signed-5.4
  UpgradeStatus: Upgraded to focal on 2020-01-24 (5 days ago)
  ---
  ProblemType: Bug
  AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-12-generic.
  ApportVersion: 2.20.11-0ubuntu16
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC0:  sarnold    2734 F.... pulseaudio
   /dev/snd/controlC1:  sarnold    2734 F.... pulseaudio
  Card0.Amixer.info:
   Card hw:0 'PCH'/'HDA Intel PCH at 0x2fe1028000 irq 145'
     Mixer name : 'Realtek ALC285'
     Components : 'HDA:10ec0285,17aa225c,00100002 HDA:8086280b,80860101,00100000'
     Controls      : 53
     Simple ctrls  : 15
  Card1.Amixer.info:
   Card hw:1 'Audio'/'Generic ThinkPad Dock USB Audio at usb-0000:00:14.0-4.2.4, high speed'
     Mixer name : 'USB Mixer'
     Components : 'USB17ef:306f'
     Controls      : 9
     Simple ctrls  : 4
  DistroRelease: Ubuntu 20.04
  HibernationDevice: RESUME=none
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: LENOVO 20KHCTO1WW
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  ProcEnviron:
   TERM=rxvt-unicode-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 i915drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/BOOT/ubuntu@/vmlinuz-5.4.0-12-generic root=ZFS=rpool/ROOT/ubuntu ro root=ZFS=rpool/ROOT/ubuntu quiet splash acpi_osi=! "acpi_osi=Windows 2015" vt.handoff=1
  ProcVersionSignature: Ubuntu 5.4.0-12.15-generic 5.4.8
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-12-generic N/A
   linux-backports-modules-5.4.0-12-generic  N/A
   linux-firmware                            1.185
  Tags:  focal
  Uname: Linux 5.4.0-12-generic x86_64
  UpgradeStatus: Upgraded to focal on 2020-01-24 (5 days ago)
  UserGroups: adm cdrom libvirt lpadmin plugdev sambashare sbuild sudo
  _MarkForUpload: True
  dmi.bios.date: 11/25/2019
  dmi.bios.vendor: LENOVO
  dmi.bios.version: N23ET69W (1.44 )
  dmi.board.asset.tag: Not Available
  dmi.board.name: 20KHCTO1WW
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40709 WIN
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: None
  dmi.modalias: dmi:bvnLENOVO:bvrN23ET69W(1.44):bd11/25/2019:svnLENOVO:pn20KHCTO1WW:pvrThinkPadX1Carbon6th:rvnLENOVO:rn20KHCTO1WW:rvrSDK0J40709WIN:cvnLENOVO:ct10:cvrNone:
  dmi.product.family: ThinkPad X1 Carbon 6th
  dmi.product.name: 20KHCTO1WW
  dmi.product.sku: LENOVO_MT_20KH_BU_Think_FM_ThinkPad X1 Carbon 6th
  dmi.product.version: ThinkPad X1 Carbon 6th
  dmi.sys.vendor: LENOVO

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1861359/+subscriptions
