We are still looking into this issue. While we can reproduce the test case and see difference in the performance, the delta is not as significant and our results have not very consistent. I'm taking the approach of setting up a more comprehensive test environment to run more tests faster. Hopefully we can then go through an analysis and bisect process with more meaningful results.
-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2042564 Title: Performance regression in the 5.15 Ubuntu 20.04 kernel compared to 5.4 Ubuntu 20.04 kernel Status in linux package in Ubuntu: New Status in linux source package in Focal: New Bug description: We in the Canonical Public Cloud team have received report from our colleagues in Google regarding a potential performance regression with the 5.15 kernel vs the 5.4 kernel on ubuntu 20.04. Their test were performed using the linux-gkeop and linux-gkeop-5.15 kernels. I have verified with the generic Ubuntu 20.04 5.4 linux-generic and the Ubuntu 20.04 5.15 linux-generic-hwe-20.04 kernels. The tests were run using `fio` fio commands: * 4k initwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc` * 4k overwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc` My reproducer was to launch an Ubuntu 20.04 cloud image locally with qemu the results are below: Using 5.4 kernel ``` ubuntu@cloudimg:~$ uname --kernel-release 5.4.0-164-generic ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128 ... fio-3.16 Starting 8 processes Jobs: 8 (f=8): [W(8)][99.6%][w=925MiB/s][w=237k IOPS][eta 00m:01s] fiojob1: (groupid=0, jobs=8): err= 0: pid=2443: Thu Nov 2 09:15:22 2023 write: IOPS=317k, BW=1237MiB/s (1297MB/s)(320GiB/264837msec); 0 zone resets slat (nsec): min=628, max=37820k, avg=7207.71, stdev=101058.61 clat (nsec): min=457, max=56099k, avg=3222240.45, stdev=1707823.38 lat (usec): min=23, max=56100, avg=3229.78, stdev=1705.80 clat percentiles (usec): | 1.00th=[ 775], 5.00th=[ 1352], 10.00th=[ 1647], 20.00th=[ 2024], | 30.00th=[ 2343], 40.00th=[ 2638], 50.00th=[ 2933], 60.00th=[ 3261], | 70.00th=[ 3654], 80.00th=[ 4146], 90.00th=[ 5014], 95.00th=[ 5932], | 99.00th=[ 8979], 99.50th=[10945], 99.90th=[18220], 99.95th=[22676], | 99.99th=[32113] bw ( MiB/s): min= 524, max= 1665, per=100.00%, avg=1237.72, stdev=20.42, samples=4232 iops : min=134308, max=426326, avg=316855.16, stdev=5227.36, samples=4232 lat (nsec) : 500=0.01%, 750=0.01%, 1000=0.01% lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01% lat (usec) : 250=0.05%, 500=0.54%, 750=0.37%, 1000=0.93% lat (msec) : 2=17.40%, 4=58.02%, 10=22.01%, 20=0.60%, 50=0.07% lat (msec) : 100=0.01% cpu : usr=3.29%, sys=7.45%, ctx=1262621, majf=0, minf=103 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1% issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=128 Run status group 0 (all jobs): WRITE: bw=1237MiB/s (1297MB/s), 1237MiB/s-1237MiB/s (1297MB/s-1297MB/s), io=320GiB (344GB), run=264837-264837msec Disk stats (read/write): sda: ios=36/32868891, merge=0/50979424, ticks=5/27498602, in_queue=1183124, util=100.00% ``` After upgrading to linux-generic-hwe-20.04 kernel and rebooting ``` ubuntu@cloudimg:~$ uname --kernel-release 5.15.0-88-generic ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128 ... fio-3.16 Starting 8 processes Jobs: 1 (f=1): [_(7),W(1)][100.0%][w=410MiB/s][w=105k IOPS][eta 00m:00s] fiojob1: (groupid=0, jobs=8): err= 0: pid=1438: Thu Nov 2 09:46:49 2023 write: IOPS=155k, BW=605MiB/s (634MB/s)(320GiB/541949msec); 0 zone resets slat (nsec): min=660, max=325426k, avg=10351.04, stdev=232438.50 clat (nsec): min=1100, max=782743k, avg=6595008.67, stdev=6290570.04 lat (usec): min=86, max=782748, avg=6606.08, stdev=6294.03 clat percentiles (usec): | 1.00th=[ 914], 5.00th=[ 2180], 10.00th=[ 2802], 20.00th=[ 3556], | 30.00th=[ 4178], 40.00th=[ 4817], 50.00th=[ 5538], 60.00th=[ 6259], | 70.00th=[ 7177], 80.00th=[ 8455], 90.00th=[ 10683], 95.00th=[ 13566], | 99.00th=[ 26870], 99.50th=[ 34866], 99.90th=[ 63177], 99.95th=[ 80217], | 99.99th=[145753] bw ( KiB/s): min=39968, max=1683451, per=100.00%, avg=619292.10, stdev=26377.19, samples=8656 iops : min= 9990, max=420862, avg=154822.58, stdev=6594.34, samples=8656 lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01% lat (usec) : 100=0.01%, 250=0.01%, 500=0.05%, 750=0.48%, 1000=0.65% lat (msec) : 2=2.79%, 4=23.00%, 10=60.93%, 20=10.08%, 50=1.83% lat (msec) : 100=0.16%, 250=0.02%, 500=0.01%, 1000=0.01% cpu : usr=3.27%, sys=7.39%, ctx=1011754, majf=0, minf=93 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1% issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=128 Run status group 0 (all jobs): WRITE: bw=605MiB/s (634MB/s), 605MiB/s-605MiB/s (634MB/s-634MB/s), io=320GiB (344GB), run=541949-541949msec Disk stats (read/write): sda: ios=264/31713991, merge=0/52167896, ticks=127/57278442, in_queue=57278609, util=99.95% ``` I have shared the results with xnox and the important datapoints to see are `bw=1237MiB/s` with the 5.4 kernel and only `bw=605MiB/s` with the 5.15 kernel. Attached find the test results initially reported by our google colleagues To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042564/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp