On Apr 11, 2011, at 3:22 PM, Paul Kraus wrote:

> On Wed, Apr 6, 2011 at 1:58 PM, Rich Morris <rich.mor...@oracle.com> wrote:
>> On 04/06/11 12:43, Paul Kraus wrote:
>>> 
>>> xxx> zfs holds zpool-01/dataset-01@1299636001
>>> NAME                               TAG            TIMESTAMP
>>> zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011
>>> xxx> zfs holds zpool-01/dataset-01@1300233615
>>> NAME                               TAG            TIMESTAMP
>>> zpool-01/dataset-01@1300233615  .send-18440-0  Tue Mar 15 20:00:47 2011
>>> xxx>
>>> 
>>>    That is what I was looking for. Looks like when a zfs send got
>>> killed it left a hanging lock (hold) around. I assume the next
>>> export/import (not likely, as this is a production zpool) or a reboot
>>> (which will happen eventually, and I can wait) will clear these. Unless
>>> there is a way to force-clear the hold.
>> 
>> The user holds won't be released by an export/import or a reboot.
>> 
>> "zfs get defer_destroy snapname" will show whether this snapshot is marked
>> for deferred destroy, and "zfs release .send-18440-0 snapname" will clear
>> that hold. If the snapshot is marked for deferred destroy, then the release
>> of the last tag will also destroy it.
> 
>    Sorry I did not get back on this last week, it got busy late in the week.
> 
>    I tried the `zfs release` and it appeared to hang, so I just let
> it be. A few hours later the server experienced a resource crunch of
> some kind (fork errors about being unable to allocate resources). The
> load also varied between about 16 and 50 (it is a 16-CPU M4000).
> 
>    Users who had an open SAMBA connection seemed OK, but eventually
> we needed to reboot the box (I did let it sit in that state as long as
> I could). Since I could not even get on the XSCF console, I had to
> `break` it to the OK prompt and sync it. The first boot hung. I then
> did a boot -rv, hoping to see the device probe that caused the hang,
> but it appeared to get past all the device discovery before hanging
> again. Finally a boot -srv got me to a login prompt. I logged in as
> root, then logged out, and it came up to multiuser-server without a
> hitch.
> 
>    I do not know what the root cause of the initial resource problem
> was, as I did not get a good core dump. I *hope* it was not the `zfs
> release`, but it may have been.
> 
>    After the boot cycle(s) the zfs snapshots are no longer held and I
> could destroy them.
> 
>    Thanks to all those who helped. This discussion is one of the best
> sources, if not THE best source, of zfs support and knowledge.


I hate to dredge up this "old" email thread, but I just wanted to:

a) say thanks ("thanks!") as I had exactly this same issue just crop up on 
Sol10u9 (zpool rev22) and sure enough, it had a hold from a previous send.

b) mention (for those who may find this thread in the future) that once I 
found the hold, the "zfs release [hold] [snapname]" method mentioned above 
worked swimmingly for me. I was nervous doing this during production hours, but 
the release command returned in about 5-7 seconds with no apparent adverse 
effects. I was then able to destroy the snap.
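For anyone who hits the same stale-send hold, the tag can be pulled out of
`zfs holds` output mechanically. Below is a minimal POSIX-shell sketch using
the output quoted earlier in this thread; the snapshot and tag names are the
ones from this incident, and the final echo only prints the release command
rather than running it (on a real system you would run it with appropriate
privileges):

```shell
#!/bin/sh
# Snapshot name as captured in the thread above.
snap='zpool-01/dataset-01@1299636001'

# Sample `zfs holds` output from the thread: a header line, then one
# row per hold with columns NAME, TAG, TIMESTAMP.
holds='NAME                               TAG            TIMESTAMP
zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011'

# Column 2 of every non-header line is a hold tag.
tag=$(printf '%s\n' "$holds" | awk 'NR > 1 { print $2 }')
echo "$tag"

# Print the release command that would clear the stale hold.
echo "zfs release $tag $snap"
```

One caveat noted earlier in the thread: if the snapshot is marked for
deferred destroy, releasing its last tag will also destroy it, so it is
worth checking `zfs get defer_destroy` on the snapshot first.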

I was initially afraid that it was somehow the "memory bug" mentioned in the 
current thread (when things are fresh in your mind, they seem more likely), so 
I'm glad this thread was out there.

matt
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
