Tim Haley wrote:
Ian Collins wrote:
Ian Collins wrote:
Tim Haley wrote:
Brent Jones wrote:

On the sending side, I CAN kill the ZFS send process, but the remote
side leaves its processes running, and I CANNOT kill -9 them. I also
cannot reboot the receiving system; during init 6, the system just
hangs trying to unmount the file systems.
I have to physically cut power to the server, but a couple days later,
this issue will occur again.


A crash dump from the receiving server with the stuck receives would be highly useful, if you can get one. 'reboot -d' would be best, but it might just hang; you can also try 'savecore -L'.

I tried a reboot -d (I even had kmem_flags=0xf set), but it did hang. I didn't try savecore.

One thing I didn't try was scat on the running system. What should I look for (with scat) if this happens again?

I now have a system with a hanging zfs receive, any hints on debugging it?


If you've got it stuck but can still do things on the console, then
run 'mdb -K' on the console and type '::stacks -m zfs'. That will summarize all zfs-related threads running in the kernel. Perhaps there will be a clue in the stacks of the receive(s) as to where they are stuck.
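A session might look roughly like this (a sketch from memory, not output from the poster's machine; addresses, counts, and frames will differ by build):

    # mdb -K
    Welcome to kmdb
    [0]> ::stacks -m zfs
    THREAD           STATE    SOBJ                COUNT
    ffffff01c9a40c60 SLEEP    CV                      1
                     swtch+0x141
                     cv_wait+0x61
                     txg_wait_synced+0x7c
                     ...

Threads parked in something like txg_wait_synced or zio_wait are the usual suspects; '::stacks' groups identical stacks, so a large COUNT on one stack is worth a closer look.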

I've seen this again on Solaris 10u7. I'm doing a couple of full sends from one pool to another on the same host. One of the sends stopped while the other completed. All zfs commands on the source pool now hang, the destination pool is OK.

::stacks isn't recognised by mdb on Solaris 10; is there an alternative?

Also, make sure the spa is healthy and not suspended. This is an example on one of my machines.
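For reference, one way to check this is the '::spa' dcmd against the live kernel (a hypothetical session, not the example the poster was referring to; addresses and pool names are made up):

    # echo ::spa | mdb -k
    ADDR                 STATE NAME
    ffffff01c8c38000    ACTIVE tank
    ffffff01c4a9d580    ACTIVE rpool

A suspended pool would show SUSPENDED in the STATE column, and I/O to it will block until it is cleared or the underlying devices come back.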

There were.

--
Ian.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss