Hi,

it's likely a layer 8 problem but I can't figure out what I am doing
wrong.

I am debugging the nvme state machine with a nvme-tcp setup. I attached
a kprobe to nvme_change_ctrl_state to trace all the state changes, but
not all of them are logged. It's consistent behavior and I tried
different things with the same result.

The state transition I'd like to see is

  LIVE (1) -> RESETTING (2) -> CONNECTING (3) -> LIVE (1)

Though the 1 -> 2 transition is always missing, though a printk says it
is happening.

cd /sys/kernel/debug/tracing
echo "r:nvme_change_ctrl_state nvme_change_ctrl_state new_state=\$arg2:x32" > 
kprobe_events
echo 1 > events/kprobes/enable
echo 1 > tracing_on
cat trace_pipe

            nvme-570     [004] .....    97.559620: nvme_change_ctrl_state: 
(nvme_tcp_setup_ctrl+0x324/0x720 [nvme_tcp] <- nvme_change_ctrl_state) 
new_state=0x1
           <...>-65      [000] .....   143.540287: nvme_change_ctrl_state: 
(nvme_reset_ctrl_work+0x76/0xc0 [nvme_tcp] <- nvme_change_ctrl_state) 
new_state=0x3
   kworker/u34:3-76      [001] .....   170.236138: nvme_change_ctrl_state: 
(nvme_tcp_setup_ctrl+0x324/0x720 [nvme_tcp] <- nvme_change_ctrl_state) 
new_state=0x1


[   97.548827] nvme_change_ctrl_state:566 3 -> 1
[   97.550547] nvme nvme1: new ctrl: NQN "nqn.io-1", addr 192.168.154.25:4420, 
hostnqn: nqn.2014-08.org.nvmexpress:uuid:befdec4c-2234-11b2-a85c-ca77c773af36
[  133.273552] nvme nvme1: I/O tag 1 (e001) type 4 opcode 0x18 (Keep Alive) QID 
0 timeout
[  133.274359] nvme nvme1: long keepalive RTT (5056 ms)
[  133.274684] nvme nvme1: failed nvme_keep_alive_end_io error=10
[  143.514090] nvme_change_ctrl_state:566 1 -> 2
[  143.529185] nvme_change_ctrl_state:566 2 -> 3
[  146.584904] nvme nvme1: failed to connect socket: -110
[  146.585553] nvme nvme1: Reconnecting in 10 seconds...
[  159.897184] nvme nvme1: failed to connect socket: -110
[  159.897633] nvme nvme1: Failed reconnect attempt 2/60
[  159.898006] nvme nvme1: Reconnecting in 10 seconds...
[  170.153915] nvme nvme1: creating 8 I/O queues.

The system is idle, so I hope it's not just a dropped event. Any ideas
what I am doing wrong?

Linux 6.13.0-rc3+, clang compiler.

Thanks,
Daniel

Reply via email to