------- Comment From vpuli...@in.ibm.com 2024-10-24 10:01 EDT-------
(In reply to comment #14)
> Vijay, Can you verify this bug ?

Hi,

I ran pgbench 5 times on an Ubuntu 24.04 L2 guest (kernel: 6.8.0-47-generic) with CEDE on and with CEDE off. I could see roughly 50% degradation on L2-Guest CEDE ON vs L2-Guest CEDE OFF (average 1241 TPS vs 2562 TPS).

CEDE ON Results :

PostgreSQL 17:
pts/pgbench-1.15.0 [Scaling Factor: 100 - Clients: 50 - Mode: Read Only]
Test 1 of 1
Estimated Trial Run Count: 5
Estimated Time To Completion: 13 Minutes [11:38 UTC]
Started Run 1 @ 11:26:09
Started Run 2 @ 11:28:45
Started Run 3 @ 11:31:15
Started Run 4 @ 11:33:45
Started Run 5 @ 11:36:19

Scaling Factor: 100 - Clients: 50 - Mode: Read Only:
1188.782604
1240.626583
1295.409026
1284.232445
1194.451742
Average: 1241 TPS
Deviation: 3.97%
Samples: 5

Scaling Factor: 100 - Clients: 50 - Mode: Read Only - Average Latency:
6.73
6.448
6.176
6.229
6.698
Average: 6.456 ms
Deviation: 3.98%
Samples: 5

CEDE OFF Results :

PostgreSQL 17:
pts/pgbench-1.15.0 [Scaling Factor: 100 - Clients: 50 - Mode: Read Only]
Test 1 of 1
Estimated Trial Run Count: 5
Estimated Time To Completion: 13 Minutes [12:07 UTC]
Started Run 1 @ 11:55:05
Started Run 2 @ 11:57:38
Started Run 3 @ 12:00:14
Started Run 4 @ 12:02:45
Started Run 5 @ 12:05:15

Scaling Factor: 100 - Clients: 50 - Mode: Read Only:
2746.538132
2321.285623
2610.106716
2523.755346
2608.03977
Average: 2562 TPS
Deviation: 6.11%
Samples: 5

Scaling Factor: 100 - Clients: 50 - Mode: Read Only - Average Latency:
2.913
3.446
3.065
3.17
3.067
Average: 3.132 ms
Deviation: 6.32%
Samples: 5

------- Comment From vpuli...@in.ibm.com 2024-10-24 10:02 EDT-------
Config of KVM Host and L2-Guest:

KVM Host Config :

root@kvmfvt:~# cpupower idle-info
CPUidle driver: pseries_idle
CPUidle governor: menu
analyzing CPU 1:
Number of idle states: 2
Available idle states: snooze CEDE
snooze:
Flags/Description: snooze
Latency: 0
Usage: 584282
Duration: 15772032
CEDE:
Flags/Description: CEDE
Latency: 12
Usage: 1693009
Duration: 569004764428

root@kvmfvt:~# numactl -H
available: 1 nodes (1)
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 1 size: 49921 MB
node 1 free: 859 MB
node distances:
node 1
1: 10

root@kvmfvt:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-6.8.0-48-generic root=UUID=b25a666f-351f-4cb3-ae86-0c86f35ab749 ro crashkernel=2048M crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
root@kvmfvt:~#

###################

L2-Guest CEDE ON Config :

root@kvmfvt:~/phoronix-test-suite# cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

root@kvmfvt:~/phoronix-test-suite# cpupower idle-info
CPUidle driver: pseries_idle
CPUidle governor: menu
analyzing CPU 4:
Number of idle states: 2
Available idle states: snooze Shared Cede
snooze:
Flags/Description: snooze
Latency: 0
Usage: 73705
Duration: 4272838
Shared Cede:
Flags/Description: Shared Cede
Latency: 10
Usage: 3667349
Duration: 570028915040

root@kvmfvt:~/phoronix-test-suite# uname -a
Linux kvmfvt 6.8.0-47-generic #47-Ubuntu SMP Fri Sep 27 21:38:55 UTC 2024 ppc64le ppc64le ppc64le GNU/Linux

root@kvmfvt:~/phoronix-test-suite# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 48342 MB
node 0 free: 25566 MB
node distances:
node 0
0: 10

root@kvmfvt:~/phoronix-test-suite# cat /proc/cmdline
BOOT_IMAGE=/vmlinux-6.8.0-47-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro crashkernel=2048M crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M

L2-Guest CEDE OFF Config :

root@kvmfvt:~/phoronix-test-suite# cpupower idle-info
CPUidle driver: pseries_idle
CPUidle governor: menu
analyzing CPU 4:
Number of idle states: 2
Available idle states: snooze Shared Cede
snooze:
Flags/Description: snooze
Latency: 0
Usage: 641899
Duration: 61252357
Shared Cede (DISABLED) :
Flags/Description: Shared Cede
Latency: 10
Usage: 3667645
Duration: 570100106856

root@kvmfvt:~/phoronix-test-suite# cat /proc/cmdline
BOOT_IMAGE=/vmlinux-6.8.0-47-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro crashkernel=2048M crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M

root@kvmfvt:~/phoronix-test-suite# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 48342 MB
node 0 free: 25565 MB
node distances:
node 0
0: 10

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2070253

Title:
  KVM on PowerVM: L2 Guest-Aggressively entering CEDE results in low performance. Possible tuning opportunity.

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Released
Status in linux source package in Noble: Fix Committed
Status in linux source package in Oracular: Fix Released

Bug description:

KVM on PowerVM: L2 Guest-Aggressively entering CEDE results in low performance. Possible tuning opportunity.

---uname output---
Linux rhel86edb1 #1 SMP Sun Jan 21 11:45:44 EST 2024 ppc64le ppc64le ppc64le GNU/Linux

---Steps to Reproduce---
Example: run the read-only test using the EDB-PGBENCH and DT7 workloads on
1. L1-Host
2. L2-Guest CEDE ON
3. L2-Guest CEDE OFF

A significant performance drop is observed in the L2-Guest CEDE on case vs the L2-Guest CEDE off case.

Note: The host and guest configurations used for the performance experiments are listed below.

Location of EDB-PGBENCH:
#wget http://ci-http-results.aus.stglabs.ibm.com/perfTest/scripts/Bug_Scripts/pgbench_install.sh
#chmod 777 pgbench_install.sh
#./pgbench_install.sh -->> it will install EDB (pgbench) and run EDB on the target LPAR.

Location of DT7 workload:
#wget http://ci-http-results.aus.stglabs.ibm.com/perfTest/scripts/Bug_Scripts/DT7-Install.sh
#chmod 777 DT7-Install.sh
#./DT7-Install.sh -->> It will install DT7.

Sample Commands :
Once the installation has succeeded, run the commands below on the target LPAR.

EDB-PGBENCH Commands :
# su - enterprisedb
# vi t1.tc -->> copy the lines below into the t1.tc file.
##########t1.tc##########
runname=select
SCALE=100
runtime=300
thread="40"
smtlist="8"
mode=select
recreateinstance=yes
recreateduringrun=yes
warmup=no
perf_stat=yes
PGSQL=/usr/local/pgsql/bin
#PGSQL=/usr/edb/as14/bin
#PGPORT=5432
cores=5
##########t1.tc##########
#cp t1.tc tc/
#./auto-run-test.sh

DT7 Commands :
After installing DT7, run the commands below:
#cd /root
#./DayTrader7_Run.sh -u 20 -l 900 -i 2

######################################################################
Machine Type: Power 10 LPAR (RHEL9.3)
gcc : 11.4.1
Memory : 300GB
Test type : pgbench-edb, DT7
######################################################################

KVM Host lscpu output :
# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 96
On-line CPU(s) list: 0-39
Off-line CPU(s) list: 40-95
Model name: POWER10 (architected), altivec supported
Model: 2.0 (pvr 0080 0200)
Thread(s) per core: 8
Core(s) per socket: 5
Socket(s): 1
Physical sockets: 1
Physical chips: 4
Physical cores/chip: 12
Virtualization features:
  Hypervisor vendor: pHyp
  Virtualization type: para
Caches (sum of all):
  L1d: 320 KiB (10 instances)
  L1i: 480 KiB (10 instances)
  L2: 10 MiB (10 instances)
  L3: 40 MiB (10 instances)
NUMA:
  NUMA node(s): 1
  NUMA node2 CPU(s): 0-39
Vulnerabilities:
  Gather data sampling: Not affected
  Itlb multihit: Not affected
  L1tf: Not affected
  Mds: Not affected
  Meltdown: Not affected
  Mmio stale data: Not affected
  Retbleed: Not affected
  Spec rstack overflow: Not affected
  Spec store bypass: Not affected
  Spectre v1: Vulnerable, ori31 speculation barrier enabled
  Spectre v2: Vulnerable
  Srbds: Not affected
  Tsx async abort: Not affected

##############################################

KVM on PowerVM setup:
KVM (Kernel Virtual Machine) is a virtualization module for Linux that allows the kernel to function as a hypervisor. We used a P10 2S4U system for this experiment.

Workloads: DT7 and PGBENCH in detail:

DT7 is an open source benchmark application emulating an online stock trading system. DT7 consists of 3 components:
1) Jmeter
2) WAS (WebSphere Application Server)
3) DB2
The DayTrader benchmark application is deployed on WAS and uses DB2 as its backing database. Jmeter generates the requests and interacts with WAS, which acts as the middleware.

PGBENCH : pgbench is a simple program for running benchmark tests on PostgreSQL. It runs the same sequence of SQL commands over and over, possibly in multiple concurrent database sessions, and then calculates the average transaction rate (transactions per second).

Config of KVM Host and L2-Guest:

KVM Host Config :

# uname -a
Linux #1 SMP Sun Jan 21 11:45:44 EST 2024 ppc64le ppc64le ppc64le GNU/Linux

# numactl -H
available: 1 nodes (1)
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 1 size: 292860 MB
node 1 free: 290979 MB
node distances:
node 1
1: 10

# cat /proc/cmdline
BOOT_IMAGE=(ieee1275//pci@800000020000021/pci1014\\,683@0/namespace@1,msdos2)/vmlinuz-6.7.0-nested.1.1a946fcde971.up.ibm.el9.ppc64le root=/dev/mapper/rhel_rhel86edb-root ro crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G rd.lvm.lv=rhel_rhel86edb/root rd.lvm.lv=rhel_rhel86edb/swap biosdevname=0 mitigations=off doorbell=off

# ppc64_cpu --dscr
DSCR is 23

# cpupower idle-info
CPUidle driver: pseries_idle
CPUidle governor: menu
analyzing CPU 0:
Number of idle states: 2
Available idle states: snooze CEDE
snooze:
Flags/Description: snooze
Latency: 0
Usage: 2656
Duration: 297483
CEDE:
Flags/Description: CEDE
Latency: 12
Usage: 159981
Duration: 95235883853

# qemu-system-ppc64 --version
QEMU emulator version 7.1.0
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

#Libvirt version : libvirt-8.7.0

L2 GUEST CONFIG :

CPUs : un-pinned

# cat /proc/cmdline
BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-6.7.0-nested.1.1a946fcde971.up.ibm.el9.ppc64le root=/dev/mapper/rhel-root ro crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap mitigations=off doorbell=off

# ppc64_cpu --dscr
DSCR is 23

# numactl -H
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 0 size: 106739 MB
node 0 free: 105211 MB
node distances:
node 0
0: 10

We ran the DT7 and PGBENCH read-only tests on the L2 guest with CEDE on vs CEDE off, and saw degradation with CEDE on compared with CEDE off. The DT7 and EDB-PGBENCH results follow.
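
As a side note on what "CEDE on" and "CEDE off" mean in practice: the per-state figures that `cpupower idle-info` prints (name, latency, usage, duration, and the DISABLED marker) are exposed by the kernel under each CPU's `cpuidle/stateN/` directory in sysfs. The sketch below reads them back in Python. The function name `read_idle_states` is made up for illustration, and the report does not say how CEDE was actually toggled for these runs, so treat this only as one way to inspect the state.

```python
from pathlib import Path


def read_idle_states(cpu_dir):
    """Return one dict per cpuidle state found under <cpu_dir>/cpuidle.

    Hypothetical helper for illustration. It reads the standard sysfs
    files: name, latency (us), usage (entry count), time (us), disable.
    """
    cpuidle = Path(cpu_dir) / "cpuidle"
    state_dirs = sorted(cpuidle.glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
    states = []
    for state_dir in state_dirs:
        def field(name):
            return (state_dir / name).read_text().strip()

        states.append({
            "index": int(state_dir.name[len("state"):]),
            "name": field("name"),
            "latency_us": int(field("latency")),
            "usage": int(field("usage")),
            "time_us": int(field("time")),
            # A non-zero "disable" value means the governor may not pick
            # this state, which cpupower reports as "(DISABLED)".
            "disabled": field("disable") != "0",
        })
    return states


if __name__ == "__main__":
    # cpu0 is an arbitrary choice; the report samples CPU 0, 1 and 4.
    for s in read_idle_states("/sys/devices/system/cpu/cpu0"):
        marker = " (DISABLED)" if s["disabled"] else ""
        print(f"{s['name']}{marker}: latency={s['latency_us']} "
              f"usage={s['usage']} duration={s['time_us']}")
```

Run inside the guest before and after toggling the state, this should show the same usage and duration counters as the `cpupower idle-info` output in the configuration sections.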

L2-GUEST 5 Cores with CEDE on:

1) EDB-PGBENCH Data :
+ /usr/local/pgsql/bin/pgbench -n -S -T 120 -c 40 -j 40 pgbench
pgbench (14.5)
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 40
number of threads: 40
duration: 120 s
number of transactions actually processed: 21811958
latency average = 0.220 ms
initial connection time = 16.004 ms
tps = 181761.468180 (without initial connection time)

2) DT7 Data:
DayTrader7 Report
Run Group ID=0
Run ID=40
Run Description=Test Run
Host=127.0.0.1
Users=40
Run_time=900
Total Instances 2
Total Throughputs 2340.6

L2-GUEST 5 Cores with CEDE off:

1) EDB-PGBENCH Data :
+ /usr/local/pgsql/bin/pgbench -n -S -T 120 -c 40 -j 40 pgbench
pgbench (14.5)
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 40
number of threads: 40
duration: 120 s
number of transactions actually processed: 37804765
latency average = 0.127 ms
initial connection time = 5.910 ms
tps = 315015.313022 (without initial connection time)

2) DT7 Data:
==================================================================================
DayTrader7 Report
Run Group ID=0
Run ID=41
Run Description=Test Run
Host=127.0.0.1
Users=40
Run_time=900
Total Instances 2
Total Throughputs 3569.6
===================================================================================

EDB-PGBENCH Performance Summary:
CEDE ON EDB-PGBENCH Data : 181761.46818 tps
CEDE OFF EDB-PGBENCH Data : 315015.31302 tps
Percentage Drop : (181761.46818 - 315015.31302) * 100 / 315015.31302 = -42.3%
With CEDE turned ON, the guest under-performed by 42% vs CEDE turned OFF.

DT7 Performance Summary:
CEDE ON DT7 Data : 2340.6 tps
CEDE OFF DT7 Data : 3569.6 tps
Percentage Drop : (2340.6 - 3569.6) * 100 / 3569.6 = -34.4%
With CEDE turned ON, the guest under-performed by 34% vs CEDE turned OFF.

From the data above, performance drops when L2-Guest CEDE is ON compared with L2-Guest CEDE OFF.
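
The percentage-drop arithmetic in the two summaries can be re-checked with a few lines of Python. This is only a sketch for reproducing the numbers: the function name `percentage_drop` and the `RESULTS` table are made up for illustration, and the inputs are the tps and throughput figures quoted in the report. The drop is expressed as a positive percentage of the CEDE-off baseline.

```python
def percentage_drop(cede_on, cede_off):
    """Throughput lost with CEDE on, as a percentage of the CEDE-off baseline."""
    if cede_off <= 0:
        raise ValueError("baseline throughput must be positive")
    return (cede_off - cede_on) * 100.0 / cede_off


# (CEDE on, CEDE off) pairs copied from the results in this report.
RESULTS = {
    "EDB-PGBENCH (tps)": (181761.468180, 315015.313022),
    "DT7 (total throughput)": (2340.6, 3569.6),
}

if __name__ == "__main__":
    for name, (on, off) in RESULTS.items():
        print(f"{name}: {percentage_drop(on, off):.1f}% drop with CEDE on")
```

The same helper can be applied to any other CEDE on/off pair, for example the averaged pgbench TPS from a Phoronix Test Suite run.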

It is well understood that the solution cannot be offered with Shared CEDE disabled. However, it would be ideal to reduce how aggressively the guest enters CEDE so that it can scale to an acceptable level of performance.

.........................................................................

The patch for this fix has been merged into the upstream kernel via commit 7be6ce7043b4cf293c8826a48fd9f56931cef2cf ("KVM: PPC: Book3S HV nestedv2: Cancel pending DEC exception").

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/2070253/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp