On Thu, Mar 4, 2010 at 7:28 PM, Ian Collins <i...@ianshome.com> wrote:
> Gary Mills wrote:
>> We have an IMAP e-mail server running on a Solaris 10 10/09 system.
>> It uses six ZFS filesystems built on a single zpool with 14 daily
>> snapshots. Every day at 11:56, a cron command destroys the oldest
>> snapshots and creates new ones, both recursively. For about four
>> minutes thereafter, the load average drops and I/O to the disk
>> devices drops to almost zero. Then, the load average shoots up to
>> about ten times normal and declines back to normal over about four
>> minutes, as disk activity resumes. The statistics return to their
>> normal state about ten minutes after the cron command runs.
>>
>> Is it destroying old snapshots or creating new ones that causes this
>> dead time? What does each of these procedures do that could affect
>> the system? What can I do to make this less visible to users?
>>
> I have a couple of Solaris 10 boxes that do something similar (hourly
> snaps) and I've never seen any lag in creating and destroying
> snapshots. One system with 16 filesystems takes 5 seconds to destroy
> the 16 oldest snaps and create 5 recursive new ones. I logged load
> average on these boxes and there is a small spike on the hour, but
> this is down to sending the snaps, not creating them.

We've seen the behaviour that Gary describes while destroying datasets
recursively (>600 GB, with 7 snapshots). Close to the end of the
operation, the server stalls for 10-15 minutes and NFS activity stops.
For small datasets/snapshots that doesn't happen, or is harder to
notice.

Does ZFS have to do something special when it's done releasing the
data blocks at the end of the destroy operation?

-- 
Giovanni Tirloni
sysdroid.com
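For concreteness, a rotation job along the lines Gary describes might
look like the minimal sketch below. The dataset name "tank/mail", the
date-based snapshot names, and the retention count are assumptions;
his actual cron script wasn't posted.

#!/usr/bin/ksh
# Sketch of a daily recursive snapshot rotation (names are illustrative).

POOL=tank/mail
KEEP=13         # keep 13 old snapshots; today's new one brings it to 14

# Destroy the oldest snapshots, recursively, until only $KEEP remain.
# Only top-level snapshots are matched; -r removes the same-named
# snapshot in every child filesystem as well.
COUNT=$(zfs list -H -t snapshot -o name -r $POOL | grep -c "^$POOL@")
zfs list -H -t snapshot -o name -s creation -r $POOL | grep "^$POOL@" |
while read SNAP; do
    [ "$COUNT" -le "$KEEP" ] && break
    zfs destroy -r "$SNAP"
    COUNT=$((COUNT - 1))
done

# Create today's snapshot recursively across all child filesystems.
zfs snapshot -r $POOL@$(date +%Y-%m-%d)

Run from cron to match the schedule in Gary's report, e.g.:

56 11 * * * /path/to/rotate-snaps.ksh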