On Sun, Dec 27, 2009 at 1:35 PM, Brent Jones <br...@servuhome.net> wrote:
> On Sun, Dec 27, 2009 at 12:55 AM, Stephan Budach <stephan.bud...@jvm.de> wrote:
>> Brent,
>>
>> I had known about that bug for a couple of weeks, but that bug was filed
>> against v111 and we're at v130. I have also searched the ZFS part of this
>> forum and really couldn't find much about this issue.
>>
>> The other issue I noticed is that, contrary to the statements I read that
>> once zfs is underway destroying a big dataset other operations would
>> continue to work, that doesn't seem to be the case. When destroying the
>> 3 TB dataset, the other zvol that had been exported via iSCSI stalled as
>> well, and that's really bad.
>>
>> Cheers,
>> budy
>> --
>> This message posted from opensolaris.org
>> _______________________________________________
>> opensolaris-help mailing list
>> opensolaris-h...@opensolaris.org
>>
>
> I just tested your claim, and you appear to be correct.
>
> I created a couple of dummy ZFS filesystems, loaded them with about 2 TB,
> exported them via CIFS, and destroyed one of them. The destroy took the
> usual amount of time (about 2 hours), and, quite to my surprise, all I/O
> on the ENTIRE zpool stalled. I don't recall seeing this prior to 130; in
> fact, I know I would have noticed it, as we create and destroy large ZFS
> filesystems very frequently.
>
> So it seems the original issue I reported many months back has actually
> gained some new negative impacts :(
>
> I'll try to escalate this with my Sun support contract, but Sun support
> still isn't very familiar/clued in about OpenSolaris, so I doubt I will
> get very far.
>
> Cross-posting to zfs-discuss as well, as others may have seen this and
> may know of a solution/workaround.
>
>
> --
> Brent Jones
> br...@servuhome.net
>
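For reference, here is roughly the sequence I used for the test quoted above.
The pool/dataset names and sizes below are placeholders, not the exact ones
from my box, so treat this as a sketch of the procedure rather than a literal
transcript:

    # create a couple of dummy filesystems and share them over CIFS
    zfs create tank/dummy1
    zfs create tank/dummy2
    zfs set sharesmb=on tank/dummy1
    zfs set sharesmb=on tank/dummy2

    # bulk-load data locally (repeated, plus copies over the CIFS
    # shares, until roughly 2 TB was in place)
    mkfile 100g /tank/dummy1/file01

    # in another terminal, watch pool I/O to see the stall
    zpool iostat tank 5

    # destroy one of the filesystems -- this took about 2 hours, and
    # all I/O on the entire pool stalled while it ran
    zfs destroy tank/dummy1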
I did some more testing, and it seems this is 100% reproducible ONLY if the
filesystem and/or the entire pool had compression or de-dupe enabled at some
point. It doesn't seem to matter whether de-dupe/compression was enabled for
5 minutes or for the entire life of the pool: as soon as either is turned on
in snv_130, doing any type of mass change (like deleting a big filesystem)
will hang ALL I/O for a significant amount of time.

If I create a filesystem with neither enabled, fill it with a few TB of data,
and do a 'zfs destroy' on it, it goes pretty quickly, just a couple of
minutes, with no noticeable impact on system I/O.

I'm curious about the 7000 series appliances, since those supposedly ship now
with de-dupe as a fully supported option. Is the core ZFS code on the 7000
appliances significantly different from a recent build of OpenSolaris? My
sales rep assures me there's very little overhead from enabling de-dupe on
the 7000 series (which he's trying to sell us, obviously), but I can't see
how that could be, when I have the same hardware the 7000s run on (a fully
loaded X4540).

Any thoughts from anyone?

--
Brent Jones
br...@servuhome.net
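P.S. For anyone who wants to try to reproduce the difference, it boils down
to roughly this -- dataset names and sizes are placeholders, and mkfile just
stands in for whatever bulk copy you use to load a few TB:

    # case 1: de-dupe (or compression) has been enabled at some point
    zfs create -o dedup=on tank/dduptest
    mkfile 500g /tank/dduptest/big01   # repeat until a few TB are loaded
    zfs destroy tank/dduptest          # hangs ALL pool I/O for a long time

    # case 2: neither de-dupe nor compression ever enabled
    zfs create tank/plaintest
    mkfile 500g /tank/plaintest/big01  # repeat until a few TB are loaded
    zfs destroy tank/plaintest         # finishes in a couple of minutes,
                                       # no noticeable impact on pool I/O

    # watching from another terminal makes the stall obvious
    zpool iostat tank 5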