On Fri, Dec 25, 2009 at 7:47 PM, Tim Cook <t...@cook.ms> wrote:
>
>
> On Fri, Dec 25, 2009 at 11:57 AM, Saso Kiselkov <skisel...@gmail.com> wrote:
>>
>> I've started porting a video streaming application to opensolaris on
>> ZFS, and am hitting some pretty weird performance issues. The thing I'm
>> trying to do is run 77 concurrent video capture processes (roughly
>> 430Mbit/s in total) all writing into separate files on a 12TB J4200
>> storage array. The disks in the array are arranged into a single RAID-0
>> ZFS volume (though I've tried different RAID levels, none helped). CPU
>> performance is not an issue (barely hitting 35% utilization on a single
>> CPU quad-core X2250). I/O bottlenecks can also be ruled out, since the
>> storage array's sequential write performance is around 600MB/s.
>>
>> The problem is the bursty behavior of ZFS writes. All the capture
>> processes do, in essence, is poll() on a socket and then read() and
>> write() any available data from it to a file. The poll() call is done
>> with a timeout of 250ms, expecting that if no data arrives within 0.25
>> seconds, the input is dead and recording stops (I tried increasing this
>> value, but the problem still arises, although not as frequently). When
>> ZFS decides that it wants to commit a transaction group to disk (every
>> 30 seconds), the system stalls for a short amount of time and depending
>> on the number of capture processes currently running, the poll() call
>> (which usually blocks for 1-2 ms) takes on the order of hundreds of ms,
>> sometimes even longer. I figured that I might be able to resolve this by
>> lowering the txg timeout to something like 1-2 seconds (I need ZFS to
>> write as soon as data arrives, since it will likely never be
>> overwritten), but I couldn't find any tunable parameter for it anywhere
>> on the net. On FreeBSD, I think this can be done via the
>> vfs.zfs.txg_timeout sysctl. A glimpse into the source at
>>
>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/txg.c
>> on line 40 made me worry that somebody maybe hard-coded this value into
>> the kernel, in which case I'd be pretty much screwed in opensolaris.
>>
>> Any help would be greatly appreciated.
>>
>> Regards,
>> - --
>> Saso
>>
>>
>
>
> Hang on... if you've got 77 concurrent threads going, I don't see how that's
> a "sequential" I/O load.  To the backend storage it's going to look like the
> equivalent of random I/O.  I'd also be surprised to see 12 1TB disks
> supporting 600MB/sec throughput and would be interested in hearing where you
> got those numbers from.
>
> Is your video capture doing 430MB or 430Mbit?
>
> --
> --Tim
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
>

I think he said 430 Mbit/s, which, if these are security cameras, would
be a good-sized installation (30+ cameras).
We have a similar system, albeit running on Windows. Writing about
400 Mbit/s using just six 1TB SATA drives is entirely possible, and it
works quite well on our system without any frame loss or much
latency.

The write lag is noticeable with ZFS, however, because of how the
transaction group writes behave. When a big write needs to land
on disk, it seems all other I/O, CPU scheduling, and "niceness" are
thrown out the window in favor of getting all that data on disk.
I was on the watch list for a ZFS I/O scheduler bug through my paid
Solaris support; I'll try to find that bug number, but I believe some
improvements landed in builds 129 and 130.
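For what it's worth, the tunable Saso found (`zfs_txg_timeout`, in seconds) is reportedly settable without recompiling, either via /etc/system or poked live with mdb. I haven't tested this on the builds in question, so verify the variable name against your build first:

```shell
# /etc/system -- takes effect on next boot
set zfs:zfs_txg_timeout = 1

# or change it on a live system (decimal write via mdb):
echo zfs_txg_timeout/W0t1 | mdb -kw
```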



-- 
Brent Jones
br...@servuhome.net