** Tags added: verification-done
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2039575
Title:
SMC stats: Wrong bucket calculation for payload of exactly 4096 bytes
Status in Ubuntu on IBM z Systems:
Fix Committed
Status in linux package in Ubuntu:
New
Status in linux source package in Jammy:
Fix Committed
Status in linux source package in Lunar:
Fix Committed
Status in linux source package in Mantic:
Fix Committed
Bug description:
SRU Justification:
[Impact]
* There is a wrong bucket calculation for payload of exactly 4096 bytes
in SMC stats counters.
* SMC_STAT_PAYLOAD_SUB and SMC_STAT_RMB_SIZE_SUB have these issues.
* The impact is that a system silently updates an incorrect stats counter
in case the underlying kernel is not UBSAN enabled.
(Difficult to detect.)
* But if a kernel is UBSAN enabled, one will see a UBSAN
'array-index-out-of-bounds' warning every time this occurs, like:
[ 26.335381] UBSAN: array-index-out-of-bounds in
/build/linux-O6Qi7m/linux-5.15.0/net/smc/af_smc.c:2402:3
[ 26.335385] index -1 is out of range for type 'u64 [9]'
[ 26.335388] CPU: 0 PID: 274 Comm: iperf3 Tainted: G E 5.15.0-79-generic
#86-Ubuntu
[ 26.335391] Hardware name: IBM 8561 T01 772 (KVM/Linux)
[ 26.335393] Call Trace:
[ 26.335397] [<00000000cd92e63a>] dump_stack_lvl+0x62/0x80
[ 26.335404] [<00000000cd92e36c>] ubsan_epilogue+0x1c/0x48
[ 26.335406] [<00000000cd52d3c4>] __ubsan_handle_out_of_bounds+0x94/0xa0
[ 26.335411] [<000003ff8033f9da>] smc_sendmsg+0x2aa/0x2d0 [smc]
[ 26.335425] [<00000000cd6a79a4>] sock_sendmsg+0x64/0x80
[ 26.335431] [<00000000cd6a7a32>] sock_write_iter+0x72/0xa0
[ 26.335433] [<00000000cd1d4000>] new_sync_write+0x100/0x190
[ 26.335438] [<00000000cd1d4bb8>] vfs_write+0x1e8/0x280
[ 26.335440] [<00000000cd1d7014>] ksys_write+0xb4/0x100
[ 26.335442] [<00000000cd932c7c>] __do_syscall+0x1bc/0x1f0
[ 26.335446] [<00000000cd940148>] system_call+0x78/0xa0
[Fix]
* a950a5921db4 a950a5921db450c74212327f69950ff03419483a "net/smc: Fix
pos miscalculation in statistics"
[Test Plan]
* Since this got identified while the livepatch for jammy patches got added,
one could run a simiar (or the same) test like mentioned at LP#1639924
(but only jammy comes with official livepatch support).
* Alternatively a dedicated SMC stats counters test with a payload of
exactly 4096 bytes could be done (which is probably easier):
* Install uperf (https://uperf.org/ - https://github.com/uperf/uperf).
(Wondering if it makes sense to pick this up as Debian/Ubuntu package ?!)
* Reset SMC-D stats on client and server side.
* Start uperf at the server side using: # uperf -vs
* Update profile with remote IP (server IP)
and start uperf at client: # uperf -vai 5 -m rr1c-4kx4k---1.xml
with uperf profile:
# cat rr1c-4kx4k---1.xml
<?xml version="1.0"?>
<profile name="TCP_RR">
<group nprocs="1">
<!--group nthreads="1"-->
<!-- if we want to run processes -->
<!--group nprocs="1"-->
<transaction iterations="1">
<flowop type="connect" options="remotehost=<remote IP> protocol=tcp
tcp_nodelay" />
</transaction>
<transaction iterations="1">
<flowop type="write" options="size=4096"/>
<flowop type="read" options="size=4096"/>
</transaction>
<transaction iterations="1">
<flowop type="disconnect" />
</transaction>
</group>
</profile>
* The smcd stats output is:
# smcd -d stats reset
SMC-D Connections Summary
Total connections handled 2
SMC connections 2 (client 2, server 0)
v1 0
v2 2
Handshake errors 0 (client 0, server 0)
Avg requests per SMC conn 14.0
TCP fallback 0 (client 0, server 0)
RX Stats
Data transmitted (Bytes) 5796 (5.796K)
Total requests 9
Buffer full 0 (0.00%)
Buffer downgrades 0
Buffer reuses 0
8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
Bufs 0 0 0 2 0 0 0 1
Reqs 8 0 0 0 0 0 0 0
TX Stats
Data transmitted (Bytes) 9960 (9.960K)
Total requests 19
Buffer full 0 (0.00%)
Buffer full (remote) 0 (0.00%)
Buffer too small 0 (0.00%)
Buffer too small (remote) 0 (0.00%)
Buffer downgrades 0
Buffer reuses 0
8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
Bufs 0 2 0 0 0 0 0 0
Reqs 18 0 0 0 0 0 0 1
Extras
Special socket calls 0
cork 0
nodelay 0
sendpage 0
splice 0
urgent data 0
* Instead of including the payload in the wrong >512 KB buckets,
the output should be to have 19 reqs in the 8 KB buckets for TX stats
and 9 reqs in the 8 KB bucket for RX stats.
[Where problems could occur]
* The changes are in common code /net/smc/smc_stats.h
hence any potential negativ impact is not limited to a specific platform.
but limited to statistics for shared memory communication (SMC)
hardware statistics.
* But limited to the functions SMC_STAT_PAYLOAD_SUB and
SMC_STAT_RMB_SIZE_SUB.
* The bucket (aka pos) calculation is not that trivial,
issues here could cause other calculation error,
that may lead to slightly different, but still wrong numbers.
* Further issues can happen in case variables in use may overflow.
* Issues can also happen if the ternary conditional operators
are not correctly used, which again may lead to wrong calculations.
[Other Info]
* This patch fixes upstream e0e4b8fa5338
("net/smc: Add SMC statistics support").
* Since the fix got upstream accepted with v6.6-rc6,
not only 22.04, but also 23.04 and 23.10 are affected.
(22.10 already reached it EOS).
* Will not affected the current development release (N-release),
since it will at least be based on a 6.6 kernel.
__________
Bug description by Nils H.:
------------
Overview:
------------
The following line in the SMC stats code (net/smc/smc_stats.h) caught my
attention when using a payload of exactly 4096 bytes:
#define SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) \
do { \
typeof(_smc_stats) stats = (_smc_stats); \
typeof(_tech) t = (_tech); \
typeof(_len) l = (_len); \
int _pos = fls64((l) >> 13); \
typeof(_rc) r = (_rc); \
int m = SMC_BUF_MAX - 1; \
this_cpu_inc((*stats).smc[t].key ## _cnt); \
if (r <= 0) \
break; \
_pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m; \ <---
this_cpu_inc((*stats).smc[t].key ## _pd.buf[_pos]); \
this_cpu_add((*stats).smc[t].key ## _bytes, r); \
} \
while (0)
With l = 4096, _pos evaluates to -1.
Checking with the following uperf profile:
# cat rr1c-4kx4k---1.xml
<?xml version="1.0"?>
<profile name="TCP_RR">
<group nprocs="1">
<!--group nthreads="1"-->
<!-- if we want to run processes -->
<!--group nprocs="1"-->
<transaction iterations="1">
<flowop type="connect" options="remotehost=<remote IP> protocol=tcp
tcp_nodelay" />
</transaction>
<transaction iterations="1">
<flowop type="write" options="size=4096"/>
<flowop type="read" options="size=4096"/>
</transaction>
<transaction iterations="1">
<flowop type="disconnect" />
</transaction>
</group>
</profile>
smcd stats output:
# smcd -d stats reset
SMC-D Connections Summary
Total connections handled 2
SMC connections 2 (client 2, server 0)
v1 0
v2 2
Handshake errors 0 (client 0, server 0)
Avg requests per SMC conn 14.0
TCP fallback 0 (client 0, server 0)
RX Stats
Data transmitted (Bytes) 5796 (5.796K)
Total requests 9
Buffer full 0 (0.00%)
Buffer downgrades 0
Buffer reuses 0
8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
Bufs 0 0 0 2 0 0 0 1
Reqs 8 0 0 0 0 0 0 0
TX Stats
Data transmitted (Bytes) 9960 (9.960K)
Total requests 19
Buffer full 0 (0.00%)
Buffer full (remote) 0 (0.00%)
Buffer too small 0 (0.00%)
Buffer too small (remote) 0 (0.00%)
Buffer downgrades 0
Buffer reuses 0
8KB 16KB 32KB 64KB 128KB 256KB 512KB >512KB
Bufs 0 2 0 0 0 0 0 0
Reqs 18 0 0 0 0 0 0 1
Extras
Special socket calls 0
cork 0
nodelay 0
sendpage 0
splice 0
urgent data 0
Instead of including the payload in the wrong >512KB buckets, output
should be to have 19 reqs in the 8KB buckets for TX stats and 9 reqs
in the 8KB bucket for RX stats.
--------
Repro:
--------
0. Install uperf.
1. Reset SMC-D stats on client and server.
2. Start uperf at server side: "uperf -vs".
3. Update profile with remote IP (server IP) and start uperf at client:
"uperf -vai 5 -m rr1c-4kx4k---1.xml" (uperf profile, see above)
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2039575/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp