Running procstat -kk will display on what syscall a process/thread
is blocking.
That might give some valuable information to people.
Regards,
Ronald.
On Thu, 24 Aug 2017 22:01:25 +0200, Mike Tancsa wrote:
OK, this is fairly easy to repeat. If I start a sync of a snapshot via
zrep, it hangs the box. CTRL+T shows
DEBUG: overiding stale lock on zroot/chyves from pid 19378
sending zroot/chyves@zrep_10 to 10.151.9.2:zroot/chyves
cannot receive new filesystem stream: destination
'zroot/chyves/guests/resi/disk1' exists
must specify -F to overwrite it
^C
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 358.94r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 360.42r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 360.79r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 360.99r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 361.19r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 361.37r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 361.55r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 361.74r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 361.92r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 362.11r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 362.31r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 362.50r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 362.69r 0.00u 0.00s 0%
3476k
load: 0.48 cmd: zfs 29690 [tx->tx_sync_done_cv] 362.90r 0.00u 0.00s 0%
3476k
load: 0.52 cmd: zfs 29690 [tx->tx_sync_done_cv] 363.08r 0.00u 0.00s 0%
3476k
On 8/24/2017 11:48 AM, Mike Tancsa wrote:
I upgraded a server yesterday from RELENG_11 from march 2017 to r322800
(Aug 22) and noticed that under heavy disk IO in a VM, the server is
locking up. In the vm, I was doing a large untar and I noticed that
prior to the lockup, the hypervisor would be struggling to keep up the
disk writes. The VM is on a zvol if that makes any difference. A few
times in the VM, IO would be clogged to the point that the disk would
timeout in the VM
Aug 24 08:32:02 kernel: ahcich6: Timeout on slot 14 port 0
Aug 24 08:32:02 kernel: ahcich6: is cs ss rs
tfd 50 serr cmd 0001db17
Aug 24 08:32:02 kernel: (ada1:ahcich6:0:0:0): WRITE_FPDMA_QUEUED. ACB:
61 00 a8 47 d8 40 01 00 00 01 00 00
Aug 24 08:32:02 kernel: (ada1:ahcich6:0:0:0): CAM status: Command
timeout
Aug 24 08:32:02 kernel: (ada1:ahcich6:0:0:0): Retrying command
When the parent deadlocks, I cant run anything thats not already in RAM.
shutdown doesnt work and I have to reboot the box via IPMI.
Any ideas how to debug this or try and better understand the problem so
I can at least work around it ?
---Mike
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"