Hello all,

I had been running snv_106 for about 3 or 4 months on a pair of X4540s. I shipped snapshots from the primary server to the secondary server nightly, which was working really well.
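For readers unfamiliar with this kind of setup, one nightly replication step might look roughly like the sketch below. It is not the actual script. The source pool name, the snapshot names, the host name, the use of ssh as the transport, and the helper function itself are assumptions made for illustration. Only the `zfs recv -vFd pdxfilu02` half matches the receive command seen on the remote side.

```shell
#!/bin/sh
# Hypothetical sketch of one nightly incremental replication step.
# All names passed in below are placeholders, not taken from the real script.

# Print (rather than run) the pipeline that would ship the delta between
# two snapshots of one file system to the receiving pool.
build_replication_cmd() {
    fs=$1       # source file system (placeholder name)
    prev=$2     # snapshot already present on the receiving side
    curr=$3     # snapshot taken tonight
    remote=$4   # receiving host (placeholder; ssh transport is an assumption)
    pool=$5     # receiving pool

    printf 'zfs send -i %s@%s %s@%s | ssh %s /sbin/zfs recv -vFd %s\n' \
        "$fs" "$prev" "$fs" "$curr" "$remote" "$pool"
}

# Dry run for a single file system; a real job would loop over all of them.
build_replication_cmd srcpool/data/fs01 20090604-00:30:00 20090605-00:30:00 backuphost pdxfilu02
```

Printing the pipeline instead of executing it keeps the sketch safe to run anywhere. With one such pipeline per file system, a receive that never returns leaves its `zfs recv` process behind on the remote host.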
However, since upgrading to 2009.06, my replication scripts appear to "hang" when performing zfs send/recv. Once one zfs send/recv process hangs, no other snapshots from any other file system can be sent to the remote host. I snapshot and replicate about 20 file systems nightly.

The script I use to perform the snapshots is here:
http://www.brentrjones.com/wp-content/uploads/2009/03/replicate.ksh

On the remote side, I end up with many "hung" processes, like this:

  bjones 11676 11661 0 01:30:03 ? 0:00 /sbin/zfs recv -vFd pdxfilu02
  bjones 11673 11660 0 01:30:03 ? 0:00 /sbin/zfs recv -vFd pdxfilu02
  bjones 11664 11653 0 01:30:03 ? 0:00 /sbin/zfs recv -vFd pdxfilu02
  bjones 13727 13722 0 14:21:20 ? 0:00 /sbin/zfs recv -vFd pdxfilu02

And so on, one for each file system.

On the receiving end, 'zfs list' shows one file system attempting to receive a snapshot, but I cannot stop it:

  $ zfs list
  NAME                                    USED   AVAIL  REFER  MOUNTPOINT
  pdxfilu02/data/fs01/%20090605-00:30:00  1.74G  27.2T  208G   /pdxfilu02/data/fs01/%20090605-00:30:00

On the sending side, I CAN kill the zfs send process, but the remote side leaves its processes running, and I CANNOT kill -9 them. I also cannot reboot the receiving system: at init 6, the system just hangs trying to unmount the file systems. I have to physically cut power to the server, and a couple of days later the issue occurs again.

If I boot into my snv_106 BE, everything works fine; this issue has never occurred on that version.

Any thoughts?

--
Brent Jones
br...@servuhome.net
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss