On 28 Dec 2009, at 00:59, Tim Cook wrote:
On Sun, Dec 27, 2009 at 1:38 PM, Roch Bourbonnais <roch.bourbonn...@sun.com> wrote:
On 26 Dec 2009, at 04:47, Tim Cook wrote:
On Fri, Dec 25, 2009 at 11:57 AM, Saso Kiselkov <skisel...@gmail.com> wrote:
I've started porting a video streaming application to OpenSolaris on ZFS, and am hitting some pretty weird performance issues. The thing I'm trying to do is run 77 concurrent video capture processes (roughly 430 Mbit/s in total), all writing into separate files on a 12TB J4200 storage array. The disks in the array are arranged into a single RAID-0 ZFS volume (though I've tried different RAID levels; none helped). CPU performance is not an issue (barely hitting 35% utilization on a single quad-core X2250). I/O bottlenecks can also be ruled out, since the storage array's sequential write performance is around 600 MB/s.
The problem is the bursty behavior of ZFS writes. All the capture processes do, in essence, is poll() on a socket and then read() and write() any available data from it to a file. The poll() call is done with a timeout of 250 ms, expecting that if no data arrives within 0.25 seconds, the input is dead and recording stops (I tried increasing this value, but the problem still arises, although not as frequently).
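In case it makes the workload clearer, here is a stripped-down sketch of what each capture process does; the function name, descriptor setup and buffer size are made up for illustration, not lifted from the real code:

    /* Sketch only: error handling trimmed, names and sizes are illustrative. */
    #include <poll.h>
    #include <unistd.h>

    #define POLL_TIMEOUT_MS 250     /* input considered dead after 0.25 s */

    static int
    capture_loop(int sock_fd, int file_fd)
    {
            char buf[64 * 1024];
            struct pollfd pfd = { .fd = sock_fd, .events = POLLIN };

            for (;;) {
                    int n = poll(&pfd, 1, POLL_TIMEOUT_MS);
                    if (n == 0)
                            return (0);     /* timeout: input is dead, stop recording */
                    if (n < 0)
                            return (-1);    /* poll() error */

                    ssize_t got = read(sock_fd, buf, sizeof (buf));
                    if (got <= 0)
                            return (-1);    /* EOF or read error */

                    /* this write() is where the process stalls when a txg commit hits */
                    if (write(file_fd, buf, (size_t)got) != got)
                            return (-1);
            }
    }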
When ZFS decides that it wants to commit a transaction group to disk (every 30 seconds), the system stalls for a short amount of time, and depending on the number of capture processes currently running, the poll() call (which usually blocks for 1-2 ms) takes on the order of hundreds of ms, sometimes even longer. I figured that I might be able to resolve this by lowering the txg timeout to something like 1-2 seconds (I need ZFS to write as soon as data arrives, since it will likely never be overwritten), but I couldn't find any tunable parameter for it anywhere on the net. On FreeBSD, I think this can be done via the vfs.zfs.txg_timeout sysctl. A glimpse into the source at http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/txg.c on line 40 made me worry that somebody may have hard-coded this value into the kernel, in which case I'd be pretty much screwed on OpenSolaris.
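If the zfs_txg_timeout global in that file is in fact meant to be tuned (I haven't found it documented anywhere, so this is just a guess on my part), I assume it could be changed without rebuilding the kernel, along these lines:

    # untested guess: /etc/system, takes effect at next boot
    set zfs:zfs_txg_timeout = 2

    # untested guess: poke the running kernel (0t2 = decimal 2 seconds)
    echo 'zfs_txg_timeout/W 0t2' | mdb -kw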
Any help would be greatly appreciated.
Regards,
--
Saso
Hang on... if you've got 77 concurrent threads going, I don't see
how that's a "sequential" I/O load. To the backend storage it's
going to look like the equivalent of random I/O.
I see this posted once in a while and I'm not sure where it comes from. Sequential workloads are important inasmuch as the FS/VM can detect them and issue large requests to disk (followed by cache hits) instead of multiple small ones. For ZFS, that detection is done at the file level, so the fact that one has N concurrent streams going is not relevant.

On writes, ZFS and its copy-on-write model make the sequential/random distinction not very defining: all writes target free blocks.
-r
That is ONLY true when there's significant free space available/a
fresh pool. Once those files have been deleted and the blocks put
back into the free pool, they're no longer "sequential" on disk,
they're all over the disk. So it makes a VERY big difference. I'm
not sure why you'd be shocked someone would bring this up.
So on writes, performance is defined more by the availability of sequential free blocks than by the application's write access pattern or the concurrency thereof.

On reads, multiple concurrent sequential streams are sequential to the filesystem independent of the number of streams, which allows some optimisation at that level. The on-disk I/O pattern is governed by the layout, and again the concurrency of streams does not come into play when trying to understand the performance.

So IMO, having N files with sequential access patterns does not imply that performance will be governed by a random I/O response from the disks.
-r