Hello Jiri,

On Thu, Jun 13, 2024 at 1:08 AM Jiri Pirko <j...@resnulli.us> wrote:
>
> From: Jiri Pirko <j...@nvidia.com>
>
> Add support for Byte Queue Limits (BQL).
>
> Tested on qemu emulated virtio_net device with 1, 2 and 4 queues.
> Tested with fq_codel and pfifo_fast. Super netperf with 50 threads is
> running in background. Netperf TCP_RR results:
>
> NOBQL FQC 1q:  159.56  159.33  158.50   154.31    agv: 157.925
> NOBQL FQC 2q:  184.64  184.96  174.73   174.15    agv: 179.62
> NOBQL FQC 4q:  994.46  441.96  416.50   499.56    agv: 588.12
> NOBQL PFF 1q:  148.68  148.92  145.95   149.48    agv: 148.2575
> NOBQL PFF 2q:  171.86  171.20  170.42   169.42    agv: 170.725
> NOBQL PFF 4q: 1505.23 1137.23 2488.70  3507.99    agv: 2159.7875
>   BQL FQC 1q: 1332.80 1297.97 1351.41  1147.57    agv: 1282.4375
>   BQL FQC 2q:  768.30  817.72  864.43   974.40    agv: 856.2125
>   BQL FQC 4q:  945.66  942.68  878.51   822.82    agv: 897.4175
>   BQL PFF 1q:  149.69  151.49  149.40   147.47    agv: 149.5125
>   BQL PFF 2q: 2059.32  798.74 1844.12   381.80    agv: 1270.995
>   BQL PFF 4q: 1871.98 4420.02 4916.59 13268.16    agv: 6119.1875

I could not reproduce such a huge improvement when I ran multiple tests
between two VMs. I'm pretty sure the BQL feature is working, but the
numbers look the same with and without BQL.

VM 1 (client): 16 cpus, x86_64, 4 queues, the latest net-next kernel
with/without this patch, pfifo_fast, napi_tx=true, napi_weight=128
VM 2 (server): 16 cpus, aarch64, 4 queues, the latest net-next kernel
without this patch, pfifo_fast

The 'ping' command between the two VMs shows:
rtt min/avg/max/mdev = 0.233/0.257/0.300/0.024 ms

I started 50 netperfs to communicate with the other side using the
following command:

#!/bin/bash

for i in $(seq 5000 5050);
do
        netperf -p $i -H [ip addr] -l 60 -t TCP_RR -- -r 64,64 > /dev/null 2>&1 &
done

The results are around 30423.62 txkB/s. If I remove '-r 64,64', they are
still the same/similar.

Am I missing something?

Thanks,
Jason
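
For reference, one way to double-check that BQL is actually in effect on
the sending VM is to read the per-queue byte_queue_limits counters that
the kernel exposes under sysfs while the netperf load is running. The
following is only a minimal sketch: the function name bql_dump and the
SYSFS_NET override are hypothetical helpers introduced for illustration,
and the interface name passed to it (eth0 below) is an assumption.

```shell
#!/bin/bash
# Minimal sketch: print the BQL state of every TX queue of one device.
# SYSFS_NET is a hypothetical override so the function can be pointed at a
# fake directory tree; on a real system it defaults to /sys/class/net.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

bql_dump() {
    local dev="$1" q
    for q in "$SYSFS_NET/$dev"/queues/tx-*; do
        # Skip queues without BQL attributes (e.g. kernel built without BQL).
        [ -d "$q/byte_queue_limits" ] || continue
        printf '%s limit=%s inflight=%s\n' \
            "${q##*/}" \
            "$(cat "$q/byte_queue_limits/limit")" \
            "$(cat "$q/byte_queue_limits/inflight")"
    done
}
```

Running "bql_dump eth0" a few times during the test should give a rough
indication: if the driver accounts bytes through BQL, inflight is expected
to be non-zero under load and limit to change over time, whereas values
that stay at 0 would suggest the queue is not using BQL.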