
Thanks for the advice. I did an in-place upgrade to the latest
development release, b130, and it seems the change in scheduling
classes for the kernel writer threads did the trick (I didn't even have
to fiddle with logbias). Now I'm just seeing small delays every 60
seconds, on the order of 20-30 ms. I'm not sure these have anything to
do with ZFS, though... they happen outside of the write bursts.
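
One way to check whether those periodic delays line up with ZFS
transaction group syncs is to timestamp spa_sync() with DTrace. A
minimal sketch, assuming the fbt provider exposes spa_sync on this
build:

  # dtrace -qn '
      /* note when a txg sync starts */
      fbt::spa_sync:entry  { self->ts = timestamp; }
      /* report how long the sync took */
      fbt::spa_sync:return /self->ts/ {
          printf("%Y  spa_sync took %d ms\n", walltimestamp,
              (timestamp - self->ts) / 1000000);
          self->ts = 0;
      }'

If the 60-second delays don't coincide with the sync times printed
above, the cause is probably outside ZFS. (For reference, logbias is a
per-dataset property, e.g. "zfs set logbias=throughput tank/fs", with
tank/fs being a placeholder dataset name.)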

Thank you all for the valuable advice!

Regards,
--
Saso

Richard Elling wrote:
> 
> On Dec 26, 2009, at 1:10 AM, Saso Kiselkov wrote:
> 
>>
>> Brent Jones wrote:
>>> On Fri, Dec 25, 2009 at 9:56 PM, Tim Cook <t...@cook.ms> wrote:
>>>>
>>>> On Fri, Dec 25, 2009 at 11:43 PM, Brent Jones <br...@servuhome.net>
>>>> wrote:
>>>>>>
>>>>>> Hang on... if you've got 77 concurrent threads going, I don't see how
>>>>>> that's
>>>>>> a "sequential" I/O load.  To the backend storage it's going to
>>>>>> look like
>>>>>> the
>>>>>> equivalent of random I/O.  I'd also be surprised to see 12 1TB disks
>>>>>> supporting 600MB/sec throughput and would be interested in hearing
>>>>>> where
>>>>>> you
>>>>>> got those numbers from.
>>>>>>
>>>>>> Is your video capture doing 430MB or 430Mbit?
>>>>>>
>>>>>> -- 
>>>>>> --Tim
>>>>>>
>>>>> I think he said 430Mbit/sec, which, if these are security cameras,
>>>>> would be a good-sized installation (30+ cameras).
>>>>> We have a similar system, albeit running on Windows. Writing about
>>>>> 400Mbit/sec using just six 1TB SATA drives is entirely possible, and
>>>>> it works quite well on our system without any frame loss or much
>>>>> latency.
>>>> Once again, Mb or MB?  They're two completely different numbers.  As
>>>> for getting 400Mbit out of 6 SATA drives, that's not really impressive
>>>> at all.  If you're saying you got 400MB, that's a different story
>>>> entirely, and while possible with sequential I/O and a proper raid
>>>> setup, it isn't happening with random.
>>>>
>>>
>>> Mb, megabit.
>>> 400 megabit (roughly 50 megabytes/sec) is not terribly high; a single
>>> SATA drive could write that 24/7 without breaking a sweat, which is
>>> why he is reporting his issue.
>>>
>>> Sequential or random, any modern system should be able to perform that
>>> task without causing disruption to other processes running on the
>>> system (if Windows can, Solaris/ZFS most definitely should be able
>>> to).
>>>
>>> I have similar workload on my X4540's, streaming backups from multiple
>>> systems at a time. These are very high end machines, dual quadcore
>>> opterons and 64GB RAM, 48x 1TB drives in 5-6 disk RAIDZ vdevs.
>>>
>>> The "write stalls" have been a significant problem since ZFS came out,
>>> and hasn't really been addressed in an acceptable fashion yet, though
>>> work has been done to improve it.
> 
> PSARC case 2009/615: System Duty Cycle Scheduling Class and ZFS IO
> Observability was integrated into b129. This creates a scheduling class
> for ZFS IO and automatically places the zio threads into that class.  This
> is not really an earth-shattering change; Solaris has had a very flexible
> scheduler for almost 20 years now. Another example is that on a desktop,
> the application which has mouse focus runs in the interactive scheduling
> class.  This is completely transparent to most folks and no tweaking is
> required.
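
(A quick way to confirm the new class is present on a given build is
priocntl(1), e.g.:

  # priocntl -l    # lists the configured scheduling classes; on b129
                   # and later, SDC (System Duty-Cycle) should appear

This is only a sketch of the check; the exact class listing varies by
release.)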
> 
> Also fixed in b129 is BUG/RFE 6881015 "ZFS write activity prevents other
> threads from running in a timely manner", which is related to the above.
> 
> 
>>> I'm still trying to find the case number I have open with Sunsolve or
>>> whatever; it was for exactly this issue, and I believe the fix was to
>>> add dozens more "classes" to the scheduler, to allow more fair disk
>>> I/O and overall "niceness" on the system when ZFS commits a
>>> transaction group.
>>
>> Wow, if there were a production-release solution to the problem, that
>> would be great! Reading the mailing list, I had almost given up hope
>> that I'd be able to work around this issue without upgrading to the
>> latest bleeding-edge development version.
> 
> Changes have to occur someplace first.  In the OpenSolaris world,
> the changes occur first in the dev train and then are backported to
> Solaris 10 (sometimes, not always).
> 
> You should try the latest build first -- be sure to follow the release
> notes. Then, if the problem persists, you might consider tuning
> zfs_txg_timeout, which can be done on a live system.
>  -- richard
> 
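
For reference, zfs_txg_timeout can be inspected and changed on a live
kernel with mdb(1). A minimal sketch -- the tunable is not a stable
interface, and the value below is only an example:

  # echo zfs_txg_timeout/D | mdb -k       # read the current value (seconds)
  # echo zfs_txg_timeout/W0t5 | mdb -kw   # set it to 5 seconds (-w = write)

To make such a change persist across reboots it would go in /etc/system,
e.g. "set zfs:zfs_txg_timeout = 5".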
