On Fri, Dec 07, 2007 at 12:58:17 +0000, Darren J Moffat wrote:
: Dickon Hood wrote:
: >On Fri, Dec 07, 2007 at 12:38:11 +0000, Darren J Moffat wrote:
: >: Dickon Hood wrote:
: >: >We've got an interesting application which involves receiving lots of
: >: >multicast groups, and writing the data to disc as a cache.  We're
: >: >currently using ZFS for this cache, as we're potentially dealing with a
: >: >couple of TB at a time.

: >: >The threads writing to the filesystem have real-time SCHED_FIFO
: >: >priorities set to 25.  The processes recovering data from the cache
: >: >and moving it elsewhere are niced at +10.

: >: >We're seeing the writes stall in favour of the reads.  For normal
: >: >workloads I can understand the reasons, but I was under the impression
: >: >that real-time processes essentially trump all others, and I'm surprised
: >: >by this behaviour; I had a dozen or so RT-processes sat waiting for disc
: >: >for about 20s.

: >: Are the files opened with O_DSYNC or does the application call fsync ?

: >No.  O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.  Would that help?

: Don't know if it will help, but it will be different :-).  I suspected 
: that since you put the processes in the RT class you would also be doing 
: synchronous writes.

Right.  I'll let you know on Monday; I'll need to restart it in the
morning.

I put the processes in the RT class as without it they dropped packets
once in a while, especially on lesser hardware (a Netra T1 can't cope
without it; a Niagara usually can...).  Very odd.

: If you can test this it may be worth doing so for the sake of gathering 
: another data point.

Noted.  I suspect (from reading the man pages) it won't make much
difference, as to my mind it looks like a scheduling issue.  Just for
interest's sake: when everything is behaving normally and we're writing
only, 'zpool iostat 10' looks like:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
content     56.9G  2.66T      0    118      0  9.64M

whilst reading and writing normally, it looks like:

content     69.8G  2.65T    435    103  54.3M  9.63M

and when everything breaks, it looks like:

content      119G  2.60T    564      0  66.3M      0

prstat usually shows the processes idling, hitting priority 125 for a
moment, and other behaviour that I'd expect.  When it all breaks, I see
most of them sat at priority 125, twiddling their thumbs.

Perplexing.

-- 
Dickon Hood

Due to digital rights management, my .sig is temporarily unavailable.
Normal service will be resumed as soon as possible.  We apologise for the
inconvenience in the meantime.

No virus was found in this outgoing message as I didn't bother looking.