On 15 Feb 08, at 11:38, Philip Beevers wrote:

> Hi everyone,
>
> This is my first post to zfs-discuss, so be gentle with me :-)
>
> I've been doing some testing with ZFS - in particular, in checkpointing
> the large, proprietary in-memory database which is a key part of the
> application I work on. In doing this I've found what seems to be some
> fairly unhelpful write throttling behaviour from ZFS.
>
> In summary, the environment is:
>
> * An x4600 with 8 CPUs and 128GBytes of memory
> * A 50GByte in-memory database
> * A big, fast disk array (a 6140 with a LUN comprised of 4 SATA drives)
> * Running Solaris 10 update 4 (problems initially seen on U3 so I got it
>   patched)
>
> The problems happen when I checkpoint the database, which involves
> putting that database on disk as quickly as possible, using the write(2)
> system call.
>
> The first time the checkpoint is run, it's quick - about 160MBytes/sec,
> even though the disk array is only sustaining 80MBytes/sec. So we're
> dirtying stuff in the ARC (and growing the ARC) at a pretty impressive
> rate.
>
> After letting the IO subside, running the checkpoint again results in
> very different behaviour. It starts running very quickly, again at
> 160MBytes/sec (with the underlying device doing 80MBytes/sec), and after
> a while (presumably once the ARC is full) things go badly wrong. In
> particular, a write(2) system call hangs for 6-7 minutes, apparently
> until all the outstanding IO is done. Any reads from that device also
> take a huge amount of time, making the box very unresponsive.
>
> Obviously this isn't good behaviour, but it's particularly unfortunate
> given that this checkpoint is stuff that I don't want to retain in any
> kind of cache anyway - in fact, preferably I wouldn't pollute the ARC
> with it in the first place. But it seems directio(3C) doesn't work with
> ZFS (unsurprisingly as I guess this is implemented in segmap), and
> madvise(..., MADV_DONTNEED) doesn't drop data from the ARC (again, I
> guess, as it's working on segmap/segvn).
>
> Of course, limiting the ARC size to something fairly small makes it
> behave much better. But this isn't really the answer.
>
> I also tried using O_DSYNC, which stops the pathological behaviour but
> makes things pretty slow - I only get a maximum of about 20MBytes/sec,
> which is obviously much less than the hardware can sustain.
>
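One possible middle ground between fully buffered writes and O_DSYNC,
offered as an application-side sketch rather than anything ZFS-specific:
keep writing in ordinary buffered mode, but call fsync(2) every so many
bytes so that the amount of unflushed data stays bounded. The names
paced_write, checkpoint_to_path and sync_every below are hypothetical,
and a sensible flush interval would have to be found by measurement.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from buf to fd, calling fsync(2) each time sync_every
 * bytes have been written since the last flush.
 * Returns 0 on success, -1 on error (errno is set).
 * sync_every is a hypothetical application knob, not a ZFS tunable.
 */
int paced_write(int fd, const char *buf, size_t len, size_t sync_every)
{
    size_t done = 0;      /* bytes written so far */
    size_t unsynced = 0;  /* bytes written since the last fsync */

    if (sync_every == 0) {
        errno = EINVAL;
        return -1;
    }

    while (done < len) {
        /* Never write past the next flush boundary. */
        size_t chunk = len - done;
        if (chunk > sync_every - unsynced)
            chunk = sync_every - unsynced;

        ssize_t n = write(fd, buf + done, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry */
            return -1;
        }
        if (n == 0) {              /* no progress: avoid spinning forever */
            errno = EIO;
            return -1;
        }
        done += (size_t)n;
        unsynced += (size_t)n;

        if (unsynced >= sync_every) {
            if (fsync(fd) != 0)
                return -1;
            unsynced = 0;
        }
    }

    /* Flush whatever is left after the last full interval. */
    if (unsynced > 0 && fsync(fd) != 0)
        return -1;
    return 0;
}

/* Convenience wrapper: create or truncate path and write buf to it. */
int checkpoint_to_path(const char *path, const char *buf, size_t len,
                       size_t sync_every)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int rc = paced_write(fd, buf, len, sync_every);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

A call such as checkpoint_to_path(path, db, db_len, 256UL << 20) would
flush every 256 MBytes. Whether that beats the 20MBytes/sec seen with
O_DSYNC on this hardware is untested here.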
> It sounds like we could do with different write throttling behaviour to
> head this sort of thing off. Of course, the ideal would be to have some
> way of telling ZFS not to bother keeping pages in the ARC.
>
> The latter appears to be bug 6429855. But the underlying behaviour
> doesn't really seem desirable; are there plans afoot to do any work on
> ZFS write throttling to address this kind of thing?
>

Throttling is being addressed.

        http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6429205


BTW, the new code will adjust write speed to disk speed very quickly.
You will not see those ultra fast initial checkpoints. Is this a concern?
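
To put rough numbers on that, here is a back-of-envelope sketch using the
figures from your mail (50 GByte database, 160MBytes/sec into memory,
80MBytes/sec to disk). It assumes both rates stay constant for the whole
checkpoint and takes 1 GByte as 1024 MBytes; the function names are made
up for illustration.

```c
/* Seconds until the application's write(2) calls have all returned when
 * nothing throttles it: limited only by how fast it can dirty memory. */
double unthrottled_return_secs(double size_mb, double app_mb_s)
{
    return size_mb / app_mb_s;
}

/* Seconds until every byte is on disk: limited by the disk rate,
 * whether or not the writer is throttled. */
double on_disk_secs(double size_mb, double disk_mb_s)
{
    return size_mb / disk_mb_s;
}

/* MBytes still dirty in memory at the moment an unthrottled writer
 * finishes, given that the disk drains concurrently. */
double backlog_mb(double size_mb, double app_mb_s, double disk_mb_s)
{
    if (app_mb_s <= disk_mb_s)
        return 0.0;  /* the disk keeps up, nothing accumulates */
    return size_mb * (1.0 - disk_mb_s / app_mb_s);
}
```

With size_mb = 51200, app_mb_s = 160 and disk_mb_s = 80, the writer
returns after about 320 seconds, the data is all on disk after about 640
seconds, and roughly 25 GBytes is still dirty when the writer finishes.
Draining that at 80MBytes/sec takes a little over 5 minutes, the same
order as the 6-7 minute hang you saw. On these assumptions a throttle
that tracks disk speed would not change the 640 seconds; it would spread
the wait across the write calls.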

-r


> Regards,
>
> -- 
>
> Philip Beevers
> Fidessa Infrastructure Development
>
> mailto:[EMAIL PROTECTED]
> phone: +44 1483 206571
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
