apport information

** Tags added: apport-collected focal uec-images

** Description changed:

  I suspect this is a kernel bug.
  
  With ubuntu <= 21, I find that this runs in about 13 seconds:
  
  python3 -c "import timeit; print(timeit.Timer('for _ in range(0,1000): pass').timeit())"
  
  With ubuntu >= 22, I find that it runs in about 83 seconds.
  
  The problem seems to be specific to Cisco UCS hardware and can be mostly
  mitigated by disabling hyperthreading.
  
  I also tried counting to a million a thousand times instead of counting
  to 1000 a million times (a million is timeit's default number of runs),
  in case the time measurement itself was the slow part, but it made no
  difference.  Even a plain loop without timeit shows about the same
  difference.
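
  The three variants described above can be sketched as follows. This is
  an illustration only, not the exact script used for the report; the
  function names are hypothetical, and the full-size run does 10**9 loop
  iterations per variant, so it takes a while.

```python
import time
import timeit


def time_with_timeit(inner, outer):
    """Run a loop counting to `inner`, repeated `outer` times by timeit."""
    stmt = f"for _ in range(0,{inner}): pass"
    return timeit.Timer(stmt).timeit(number=outer)


def time_plain_loop(inner, outer):
    """Do the same work with a plain nested loop and a single timer read."""
    start = time.perf_counter()
    for _ in range(outer):
        for _ in range(0, inner):
            pass
    return time.perf_counter() - start


if __name__ == "__main__":
    # Counting to 1000 a million times (timeit's default number of runs).
    print("1000 x 1e6 :", time_with_timeit(1000, 1_000_000))
    # Counting to a million a thousand times, so timeit does far less bookkeeping.
    print("1e6 x 1000 :", time_with_timeit(1_000_000, 1000))
    # No timeit at all.
    print("plain loop :", time_plain_loop(1000, 1_000_000))
```

  If all three numbers move together between kernels, the timing
  machinery is unlikely to be the cause.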
  
  Originally, I encountered this when upgrading from 18 to 24.  We went
  back and isolated the problem to something that changed between 21 and
  22.  The version I actually care about is 24.
  
  The only Cisco UCS systems we have are a bunch of Cisco UCS C220 M5SX
  rack servers and a bunch of Cisco UCS B200 M5 blades.  All of them show
  the regression.  I can confirm that on a variety of similarly-specced
  supermicro systems, the regression does not occur.
  
  The problem can be easily reproduced by booting off
  https://releases.ubuntu.com/24.04.1/ubuntu-24.04.1-live-server-amd64.iso
  (or various other versions) and dropping into a shell.  The installer
  kernel behaves the same as the installed kernel across the various
  versions.  So it should be possible for anyone with this hardware to
  reproduce the issue by using the installer shells.  You may wish to use
  an old python3 from a version-pinned docker image to get an
  apples-to-apples comparison.
  
  If I run the experiment inside ubuntu18 containers on ubuntu21 and
  ubuntu22 hosts, I still get the dramatically different runtimes.  That
  is, the kernel version, not the userland or python version, is what
  seems to matter.
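
  One way to keep container results straight is to print the kernel,
  userland and python version next to each timing. A container shares
  the host kernel, so the kernel release reported inside it is the
  host's. The sketch below is hypothetical (the helper names are not
  from the original report) and assumes /etc/os-release is present.

```python
import platform
import timeit


def os_release_name(path="/etc/os-release"):
    """Return PRETTY_NAME from os-release, or 'unknown' if it cannot be read."""
    try:
        with open(path) as handle:
            for line in handle:
                if line.startswith("PRETTY_NAME="):
                    return line.split("=", 1)[1].strip().strip('"')
    except OSError:
        pass
    return "unknown"


def describe_environment():
    """Collect the three things that vary between the test setups."""
    return {
        "kernel": platform.release(),    # host kernel, even inside a container
        "userland": os_release_name(),   # distribution of the container image
        "python": platform.python_version(),
    }


def run_benchmark(number=1_000_000):
    """Same statement as the one-liner in the report."""
    return timeit.Timer("for _ in range(0,1000): pass").timeit(number=number)


if __name__ == "__main__":
    print(describe_environment())
    print("seconds:", run_benchmark())
```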
  
  We have tried mitigations=off with no effect.
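
  To confirm that a boot flag such as mitigations=off actually took
  effect, the kernel command line and the per-vulnerability status files
  in sysfs can be inspected. This is a minimal sketch with hypothetical
  function names; it assumes a Linux system exposing /proc/cmdline and
  /sys/devices/system/cpu/vulnerabilities.

```python
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def cmdline_has_flag(cmdline, flag="mitigations=off"):
    """Check whether a kernel command line string contains the given flag."""
    return flag in cmdline.split()


def read_vulnerabilities(vuln_dir=VULN_DIR):
    """Map each vulnerability name to the status string the kernel reports."""
    status = {}
    try:
        entries = sorted(vuln_dir.iterdir())
    except OSError:
        return status  # directory missing or not readable
    for entry in entries:
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""
    print("mitigations=off on cmdline:", cmdline_has_flag(cmdline))
    for name, state in read_vulnerabilities().items():
        print(f"{name}: {state}")
```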
  
  We have tried reverting various kernel scheduler configuration changes
  back to their ubuntu21 settings with no effect.
  
  We have tried disabling hyperthreading in the BIOS.  This had an
  enormous effect.  It reduces runtime from 83 seconds to 17 seconds.  17
  is still 30% slower than 13, but it is obviously way better than 83.
  
  So just to recap:
  13s: ubuntu21 with hyperthreading on
  83s: ubuntu22 with hyperthreading on
  17s: ubuntu22 with hyperthreading off
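
  The ratios behind that recap can be worked out directly from the three
  measured runtimes. The snippet only restates the numbers above; the
  variable and function names are made up for illustration.

```python
# Measured runtimes in seconds, taken from the recap.
RUNTIMES = {
    "ubuntu21, hyperthreading on": 13.0,
    "ubuntu22, hyperthreading on": 83.0,
    "ubuntu22, hyperthreading off": 17.0,
}
BASELINE = RUNTIMES["ubuntu21, hyperthreading on"]


def slowdown_factor(seconds, baseline=BASELINE):
    """How many times longer than the baseline a run took."""
    return seconds / baseline


def percent_slower(seconds, baseline=BASELINE):
    """Extra runtime relative to the baseline, as a percentage."""
    return (seconds - baseline) / baseline * 100.0


if __name__ == "__main__":
    for label, seconds in RUNTIMES.items():
        print(f"{label}: {slowdown_factor(seconds):.2f}x, "
              f"{percent_slower(seconds):.0f}% slower")
```

  With these inputs the hyperthreading-on regression is a factor of
  about 6.4, and the hyperthreading-off case is about 1.3.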
  
  This machine has 2 sockets with 20 physical cores each for a total of 80
  logical cores once we account for hyperthreading.
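
  The core arithmetic, and whether SMT is currently active, can be
  checked from a shell on the machine. A minimal sketch, assuming a
  Linux kernel that exposes /sys/devices/system/cpu/smt; the function
  names are hypothetical.

```python
import os
from pathlib import Path

SMT_DIR = Path("/sys/devices/system/cpu/smt")


def expected_logical_cpus(sockets, cores_per_socket, threads_per_core):
    """Logical CPU count implied by the hardware topology."""
    return sockets * cores_per_socket * threads_per_core


def read_smt_state(smt_dir=SMT_DIR):
    """Return (control, active) from sysfs, or None if it is unavailable."""
    try:
        control = (smt_dir / "control").read_text().strip()
        active = (smt_dir / "active").read_text().strip() == "1"
    except OSError:
        return None
    return control, active


if __name__ == "__main__":
    # 2 sockets x 20 cores x 2 threads, as on the machine in this report.
    print("expected with hyperthreading:", expected_logical_cpus(2, 20, 2))
    print("expected without            :", expected_logical_cpus(2, 20, 1))
    print("visible to the OS           :", os.cpu_count())
    print("SMT (control, active)       :", read_smt_state())
```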
  
  Ideally I would prefer not to be forced to disable hyperthreading.  Even
  if that is not possible, I am interested in avoiding the remaining 30%
  slowdown.
  
  sysbench --test=cpu and sysbench --test=memory also both exhibit a
  slowdown, but it is more like a 30% slowdown instead of the roughly 6x
  seen with the python loop, even with hyperthreading turned on.
  
  I have used perf to profile python and found the time was spread out; I
  did not see any particular smoking gun.  The python process makes < 300
  syscalls over its entire lifetime and virtually no context switches.  I
  tried running it with realtime priority with affinity for a single core,
  which seemed to make little difference.  The python process uses 100% of
  a cpu as it runs.
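
  The context-switch and CPU-usage observations can be reproduced from
  inside python itself, without perf. The sketch below is hypothetical
  and not the tooling used for the report. It pins the process to one
  CPU where os.sched_setaffinity is available and reads the counters
  from getrusage. It does not set realtime priority, since that needs
  elevated privileges.

```python
import os
import resource
import time


def count_loop(iterations):
    """The workload: a bare counting loop."""
    for _ in range(iterations):
        pass


def measure(iterations, cpu=None):
    """Run the loop and report wall time, CPU time and context switches."""
    if cpu is not None and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {cpu})  # pin this process to a single CPU
    before = resource.getrusage(resource.RUSAGE_SELF)
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    count_loop(iterations)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "voluntary_switches": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_switches": after.ru_nivcsw - before.ru_nivcsw,
    }


if __name__ == "__main__":
    pinned = None
    if hasattr(os, "sched_getaffinity"):
        pinned = min(os.sched_getaffinity(0))  # first CPU we are allowed on
    print(measure(100_000_000, cpu=pinned))
```

  CPU seconds close to wall seconds, with few switches, matches the
  picture of a process that is on-CPU the whole time yet still slow.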
  
  Any ideas?
+ --- 
+ ProblemType: Bug
+ AlsaDevices:
+  total 0
+  crw-rw----+ 1 root audio 116,  1 Oct  2 19:29 seq
+  crw-rw----+ 1 root audio 116, 33 Oct  2 19:29 timer
+ AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
+ ApportVersion: 2.20.11-0ubuntu27.26
+ Architecture: amd64
+ ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
+ AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
+ CasperMD5CheckResult: pass
+ DistroRelease: Ubuntu 20.04
+ InstallationDate: Installed on 2024-10-01 (0 days ago)
+ InstallationMedia: Ubuntu-Server 20.04.6 LTS "Focal Fossa" - Release amd64 (20230314.1)
+ IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
+ Lsusb:
+  Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
+  Bus 001 Device 003: ID 04b4:6570 Cypress Semiconductor Corp. Unprogrammed CY7C65632/34 hub HX2VL
+  Bus 001 Device 004: ID 0624:0402 Avocent Corp. Cisco Virtual Keyboard and Mouse
+  Bus 001 Device 002: ID 05a6:0a00 Cisco Systems, Inc. Integrated Management Controller Hub
+  Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
+ MachineType: Cisco Systems Inc UCSC-C220-M5SX
+ Package: linux (not installed)
+ PciMultimedia:
+  
+ ProcEnviron:
+  TERM=linux
+  PATH=(custom, no user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
+ ProcFB: 0 mgag200drmfb
+ ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.15.0-122-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro
+ ProcVersionSignature: Ubuntu 5.15.0-122.132~20.04.1-generic 5.15.163
+ RelatedPackageVersions:
+  linux-restricted-modules-5.15.0-122-generic N/A
+  linux-backports-modules-5.15.0-122-generic  N/A
+  linux-firmware                              1.187.39
+ RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
+ Tags:  focal uec-images
+ Uname: Linux 5.15.0-122-generic x86_64
+ UnreportableReason: This report is about a package that is not installed.
+ UpgradeStatus: No upgrade log present (probably fresh install)
+ UserGroups: N/A
+ _MarkForUpload: False
+ dmi.bios.date: 09/17/2020
+ dmi.bios.release: 5.14
+ dmi.bios.vendor: Cisco Systems, Inc.
+ dmi.bios.version: C220M5.4.1.2b.0.0917201934
+ dmi.board.asset.tag: GT02175
+ dmi.board.name: UCSC-C220-M5SX
+ dmi.board.vendor: Cisco Systems Inc
+ dmi.board.version: 74-105772-01
+ dmi.chassis.asset.tag: GT02175
+ dmi.chassis.type: 23
+ dmi.chassis.vendor: Cisco Systems Inc
+ dmi.chassis.version: 74-105774-03
+ dmi.modalias: dmi:bvnCiscoSystems,Inc.:bvrC220M5.4.1.2b.0.0917201934:bd09/17/2020:br5.14:svnCiscoSystemsInc:pnUCSC-C220-M5SX:pvrA0:rvnCiscoSystemsInc:rnUCSC-C220-M5SX:rvr74-105772-01:cvnCiscoSystemsInc:ct23:cvr74-105774-03:sku:
+ dmi.product.name: UCSC-C220-M5SX
+ dmi.product.version: A0
+ dmi.sys.vendor: Cisco Systems Inc

** Attachment added: "CRDA.txt"
   https://bugs.launchpad.net/bugs/2083077/+attachment/5824229/+files/CRDA.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2083077

Title:
  python3 counting 6x slowdown with ubuntu22 on cisco ucs hardware with
  hyperthreading

Status in linux package in Ubuntu:
  New
