Matthew Ahrens wrote:
Joseph Barbey wrote:
Robert Milkowski wrote:
JB> So, normally, when the script runs, all snapshots finish in maybe a
JB> minute total. However, on Sundays, it continues to take longer and
JB> longer. On 2/25 it took 30 minutes, and this last Sunday, it took 2:11.
JB> The only special thing about Sunday's snapshots is that they are the
JB> first ones created since the full backup (using NetBackup) on Saturday.
JB> All other backups are incrementals.
Hmmmmm, do you have the atime property set to off?
Maybe you spend most of the time destroying snapshots due to a much
larger delta caused by atime updates? You can possibly also gain some
performance by setting atime to off.
Yep, atime is set to off for all pools and filesystems. I looked
through the other possible properties, and nothing looked like it
would really affect things.
One additional weird thing. My script hits each filesystem
(email-pool/A..Z) individually, so I can run zfs list -t snapshot and
find out how long each snapshot actually takes. Everything runs fine
until I get to around V or (normally) W. Then it can take a couple of
hours on the one FS. After that, the rest go quickly.
So, what operation exactly is taking "a couple of hours on the one FS"?
The only one I can imagine taking more than a minute would be 'zfs
destroy', but even that should be very rare on a snapshot. Is it always
the same FS that takes longer than the rest? Is the pool busy when you
do the slow operation?
I've now determined that renaming the previous snapshot seems to be the
problem in certain instances.
What we are currently doing through the script is to keep 2 weeks of daily
snapshots of the various pool/filesystems. These snapshots are named
{fs}.$Day-1, {fs}.$Day-2, and {fs}.snap. Specifically, for our 'V'
filesystem, which is created under the email-pool, I will have the
following snapshots:
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
email-pool/[EMAIL PROTECTED]
So, my script does the following for each FS:
Check for FS.$Day-2. If it exists, destroy it.
Check for FS.$Day-1. If it exists, rename it to FS.$Day-2.
Check for FS.snap. If it exists, rename it to FS.$Yesterday-1 (the day it was created).
Create FS.snap
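
For anyone following along, the four steps above can be sketched roughly as below. This is only an illustration, not my actual script: the function name rotation_plan, the '@' separator, and sample names such as Sun-1 are assumptions made for the example, and it only builds the list of zfs commands rather than running them.

```python
def rotation_plan(fs, day, yesterday, existing):
    """Return the zfs commands (as argv lists) for one filesystem.

    fs        -- filesystem name, e.g. "email-pool/V"
    day       -- today's day label (hypothetical format, e.g. "Sun")
    yesterday -- yesterday's day label (e.g. "Sat")
    existing  -- set of snapshot names that currently exist
    """
    oldest = f"{fs}@{day}-2"          # two-week-old snapshot for this weekday
    older = f"{fs}@{day}-1"           # one-week-old snapshot for this weekday
    current = f"{fs}@snap"            # most recent snapshot
    renamed = f"{fs}@{yesterday}-1"   # new name for the most recent snapshot

    plan = []
    if oldest in existing:
        plan.append(["zfs", "destroy", oldest])
    if older in existing:
        plan.append(["zfs", "rename", older, oldest])
    if current in existing:
        plan.append(["zfs", "rename", current, renamed])
    plan.append(["zfs", "snapshot", current])
    return plan


if __name__ == "__main__":
    # Assumed snapshot names, for illustration only.
    have = {"email-pool/V@Sun-2", "email-pool/V@Sun-1", "email-pool/V@snap"}
    for argv in rotation_plan("email-pool/V", "Sun", "Sat", have):
        print(" ".join(argv))
```

Keeping the plan separate from execution makes it easy to time each command individually when it is eventually run.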
I added logging to a file, along with the action just run and the time that
it completed:
Destroy email-pool/[EMAIL PROTECTED] Sun Apr 8 00:01:04 CDT 2007
Rename email-pool/[EMAIL PROTECTED] email-pool/[EMAIL PROTECTED] Sun Apr 8 00:01:05 CDT 2007
Rename email-pool/[EMAIL PROTECTED] email-pool/[EMAIL PROTECTED] Sun Apr 8 00:54:52 CDT 2007
Create email-pool/[EMAIL PROTECTED] Sun Apr 8 00:54:53 CDT 2007
Looking at the above, the second rename ran from 00:01:05 until 00:54:52,
so almost 54 minutes.
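
As a quick check of that figure, here is a small sketch that computes the gap between two of the logged completion times. The helper name parse_log_time is hypothetical, and it assumes both stamps are in the same time zone, so the zone abbreviation is simply dropped rather than interpreted.

```python
from datetime import datetime


def parse_log_time(stamp):
    """Parse a stamp like 'Sun Apr 8 00:01:05 CDT 2007' as a naive datetime."""
    weekday, month, day, clock, zone, year = stamp.split()
    # The weekday and zone fields are ignored; both stamps are assumed
    # to come from the same machine in the same time zone.
    return datetime.strptime(f"{month} {day} {clock} {year}",
                             "%b %d %H:%M:%S %Y")


start = parse_log_time("Sun Apr 8 00:01:05 CDT 2007")  # first rename done
end = parse_log_time("Sun Apr 8 00:54:52 CDT 2007")    # second rename done
elapsed = end - start
print(elapsed)  # 0:53:47
```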
So, any ideas on why a rename should take so long? And again, why is this
only happening on Sunday? Any other information I can provide that might
help diagnose this?
Thanks again for any help on this.
--
Joe Barbey IT Services/Network Services
office: (715) 425-4357 Davee Library room 166C
cell: (715) 821-0008 UW - River Falls
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss