One of my old pools was version 10, another was version 13.
I guess that explains the problem.
Seems like time-sliderd should refuse to run on pools that
aren't of a sufficient version.
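Something along these lines at startup would do it (a rough sketch of
the guard I have in mind, not actual time-sliderd code; the threshold
of 18 comes from Cindy's note below):

for pool in $(zpool list -H -o name); do
    # "zfs destroy -d" (snapshot user holds) needs pool version >= 18,
    # so refuse to manage pools older than that.
    ver=$(zpool get -H -o value version $pool)
    if [ "$ver" -lt 18 ]; then
        echo "time-sliderd: skipping $pool (pool version $ver < 18)" >&2
    fi
done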
Cindy Swearingen wrote on 02/18/11 12:07 PM:
Hi Bill,
I think the root cause of this problem is that time-slider uses the
zfs destroy -d feature, but that feature is only available in later
pool versions, so the routine removal of time-slider-generated
snapshots fails on older pools.
The zfs destroy -d feature (snapshot user holds) was introduced in pool
version 18.
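To illustrate (made-up dataset name, on a new-enough pool): a snapshot
with a user hold can't be destroyed outright, but zfs destroy -d marks
it for deferred destruction, and it goes away when the last hold is
released:

# zfs hold mykeep rpool/export@snap1      (place a user hold)
# zfs destroy rpool/export@snap1          (fails while the hold exists)
# zfs destroy -d rpool/export@snap1       (defers the destroy instead)
# zfs release mykeep rpool/export@snap1   (last hold gone, snapshot destroyed)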
I think this bug describes some or all of the problem:
https://defect.opensolaris.org/bz/show_bug.cgi?id=16361
Thanks,
Cindy
On 02/18/11 12:34, Bill Shannon wrote:
In the last few days my performance has gone to hell. I'm running:
# uname -a
SunOS nissan 5.11 snv_150 i86pc i386 i86pc
(I'll upgrade as soon as the desktop hang bug is fixed.)
The performance problems seem to be due to excessive I/O on the main
disk/pool.
The only things I've changed recently are that I created and destroyed
a snapshot and ran "zpool upgrade".
Here's what I'm seeing:
# zpool iostat rpool 5
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       13.3G   807M      7     85  15.9K   548K
rpool       13.3G   807M      3     89  1.60K   723K
rpool       13.3G   810M      5     91  5.19K   741K
rpool       13.3G   810M      3     94  2.59K   756K
Using iofileb.d from the dtrace toolkit shows:
# iofileb.d
Tracing... Hit Ctrl-C to end.
^C
   PID CMD              KB FILE
     0 sched             6 <none>
     5 zpool-rpool    7770 <none>
zpool status doesn't show any problems:
# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c3d0s0    ONLINE       0     0     0
Perhaps related to this or perhaps not, I discovered recently that
time-sliderd was doing just a ton of "close" requests. I disabled
time-sliderd while trying to solve my performance problem.
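For anyone doing the same, time-slider runs under SMF, so disabling it
should just be (assuming the stock FMRI):

# svcadm disable svc:/application/time-slider:default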
I was also getting these error messages in the time-sliderd log file:
Warning: Cleanup failed to destroy:
rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01
Details:
['/usr/bin/pfexec', '/usr/sbin/zfs', 'destroy', '-d',
'rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01'] failed with exit code 1
cannot destroy 'rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01':
unsupported version
That was the reason I did the zpool upgrade.
I discovered that I had a *ton* of snapshots from time-slider that
hadn't been destroyed, over 6500 of them, presumably all because of this
version problem?
I manually removed all the snapshots and my performance returned to normal.
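In case it's useful to anyone else, something like this one-liner handles
the bulk removal (the zfs-auto-snap pattern matches time-slider's naming;
eyeball the list before piping it into destroy):

# zfs list -H -t snapshot -o name | grep zfs-auto-snap | xargs -n1 zfs destroy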
I don't quite understand what the "-d" option to "zfs destroy" does.
Why does time-sliderd use it, and why does it prevent these snapshots
from being destroyed?
Shouldn't time-sliderd detect that it can't destroy any of the snapshots
it's created and stop creating snapshots?
And since I don't quite understand why time-sliderd was failing to
begin with, I'm nervous about re-enabling it. Do I need to do a
"zpool upgrade" on all my pools to make it work?
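If I understand zpool upgrade right, running it with no arguments should
show which pools are behind, and -v describes what each version adds
(user holds appear at version 18):

# zpool upgrade      (lists pools not running the latest version)
# zpool upgrade -v   (describes each pool version)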
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss