One of my old pools was version 10, another was version 13.
I guess that explains the problem.
Seems like time-sliderd should refuse to run on pools that
aren't of a sufficient version.
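Something along these lines at startup would do it (a rough sketch of
the guard I have in mind, not actual time-sliderd code; the threshold
of 18 comes from Cindy's note below):

for pool in $(zpool list -H -o name); do
    # "zfs destroy -d" (snapshot user holds) needs pool version >= 18,
    # so refuse to manage pools older than that.
    ver=$(zpool get -H -o value version $pool)
    if [ "$ver" -lt 18 ]; then
        echo "time-sliderd: skipping $pool (pool version $ver < 18)" >&2
    fi
done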
Cindy Swearingen wrote on 02/18/11 12:07 PM:
Hi Bill,
I think the root cause of this problem is that time-slider uses the
zfs destroy -d feature, but that feature is only available in later
pool versions, so the routine removal of time-slider-generated
snapshots fails on older pools.
The zfs destroy -d feature (snapshot user holds) was introduced in pool
version 18.
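To illustrate (made-up dataset name, on a new-enough pool): a snapshot
with a user hold can't be destroyed outright, but zfs destroy -d marks
it for deferred destruction, and it goes away when the last hold is
released:

# zfs hold mykeep rpool/export@snap1      (place a user hold)
# zfs destroy rpool/export@snap1          (fails while the hold exists)
# zfs destroy -d rpool/export@snap1       (defers the destroy instead)
# zfs release mykeep rpool/export@snap1   (last hold gone, snapshot destroyed)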
I think this bug describes some or all of the problem:
https://defect.opensolaris.org/bz/show_bug.cgi?id=16361
Thanks,
Cindy
On 02/18/11 12:34, Bill Shannon wrote:
In the last few days my performance has gone to hell. I'm running:
# uname -a
SunOS nissan 5.11 snv_150 i86pc i386 i86pc
(I'll upgrade as soon as the desktop hang bug is fixed.)
The performance problems seem to be due to excessive I/O on the main
disk/pool.
The only things I've changed recently are that I created and destroyed
a snapshot and ran "zpool upgrade".
Here's what I'm seeing:
# zpool iostat rpool 5
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       13.3G   807M      7     85  15.9K   548K
rpool       13.3G   807M      3     89  1.60K   723K
rpool       13.3G   810M      5     91  5.19K   741K
rpool       13.3G   810M      3     94  2.59K   756K
Using iofileb.d from the dtrace toolkit shows:
# iofileb.d
Tracing... Hit Ctrl-C to end.
^C
   PID CMD              KB FILE
     0 sched             6 <none>
     5 zpool-rpool    7770 <none>
zpool status doesn't show any problems:
# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c3d0s0    ONLINE       0     0     0
Perhaps related to this or perhaps not, I discovered recently that
time-sliderd was doing just a ton of "close" requests. I disabled
time-sliderd while trying to solve my performance problem.
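For anyone doing the same, time-slider runs under SMF, so disabling it
should just be (assuming the stock FMRI):

# svcadm disable svc:/application/time-slider:default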
I was also getting these error messages in the time-sliderd log file:
Warning: Cleanup failed to destroy:
rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01
Details:
['/usr/bin/pfexec', '/usr/sbin/zfs', 'destroy', '-d',
'rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01'] failed with exit code 1
cannot destroy 'rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01':
unsupported version
That was the reason I did the zpool upgrade.
I discovered that I had a *ton* of snapshots from time-slider that
hadn't been destroyed, over 6500 of them, presumably all because of this
version problem?
I manually removed all the snapshots and my performance returned to normal.
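In case it's useful to anyone else, something like this one-liner handles
the bulk removal (the zfs-auto-snap pattern matches time-slider's naming;
eyeball the list before piping it into destroy):

# zfs list -H -t snapshot -o name | grep zfs-auto-snap | xargs -n1 zfs destroy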
I don't quite understand what the "-d" option to "zfs destroy" does.
Why does time-sliderd use it, and why does it prevent these snapshots
from being destroyed?
Shouldn't time-sliderd detect that it can't destroy any of the snapshots
it's created and stop creating snapshots?
And since I don't quite understand why time-sliderd was failing to
begin with, I'm nervous about re-enabling it. Do I need to do a
"zpool upgrade" on all my pools to make it work?
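If I understand zpool upgrade right, running it with no arguments should
show which pools are behind, and -v describes what each version adds
(user holds appear at version 18):

# zpool upgrade      (lists pools not running the latest version)
# zpool upgrade -v   (describes each pool version)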
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss