Thanks Michael,
Useful stuff to try. I wish we could add more memory, but the x4500
is limited to 16GB. Compression was a question. It's currently off,
but they were thinking of turning it on.
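For what it's worth, the current setting can be confirmed per-dataset
with something like the following ("tank" here is just a stand-in for
the real pool name):

  zfs get compression tank        # value for the top-level dataset
  zfs get -r compression tank     # recursively, for every dataset in the pool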
bill
On Dec 15, 2009, at 7:02 PM, Michael Herf wrote:
I have also had slow scrubbing on filesystems with lots of files,
and I agree that it does seem to degrade badly. For me, it seemed to
go from 24 hours to 72 hours in a matter of a few weeks.
I did these things on a pool in-place, which helped a lot (no
rebuilding):
1. reduced number of snapshots (auto snapshots can generate a lot of
files).
2. disabled compression and rebuilt affected datasets (is
compression on?)
3. upgraded to b129, which has metadata prefetch for scrub, seems to
help by ~2x?
4. tar'd up some extremely large folders
5. added 50% more RAM.
6. turned off atime
My scrubs went from 80 hours to 12 with these changes. (4TB used,
~10M files + 10 snapshots each.)
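For anyone wanting to try the same in-place changes, the commands are
roughly as follows. This is only a sketch -- "tank" and "tank/mail" are
made-up pool/dataset names, not Michael's actual layout:

  zfs set compression=off tank/mail     # item 2: affects new writes only; existing data stays compressed until rewritten
  zfs set atime=off tank/mail           # item 6: stop updating access times on every read
  zfs list -t snapshot -r tank | wc -l  # item 1: see how many snapshots exist before trimming
  zfs destroy tank/mail@oldsnap         # item 1: remove one old (e.g. auto) snapshot
  zpool scrub tank                      # re-run the scrub...
  zpool status tank                     # ...and watch progress / completion time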
I haven't figured out whether "disable compression" or "fewer snapshots/
files and more RAM" made the bigger difference. I'm assuming that once
the metadata for that many files no longer fits in the ARC you get
dramatically lower performance, and maybe compression adds some
overhead on top of that, but I don't know; this is just what worked.
It would be nice to have a benchmark set for features like this &
general recommendations for RAM/ARC size, based on number of files,
etc. How does ARC usage scale with snapshots? Scrub on a huge
maildir machine seems like it would make a nice benchmark.
I used "zdb -d pool" to figure out which filesystems had a lot of
objects, and figured out places to trim based on that.
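A sketch of that step (again with "tank" as a placeholder pool name; the
awk field positions are an assumption about the output layout, so
double-check them before relying on the sorted list):

  zdb -d tank                                          # one line per dataset, each ending with its object count
  zdb -d tank | awk '{print $(NF-1), $2}' | sort -n    # list datasets sorted by object count, assuming the count is the next-to-last field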
mike
On Tue, Dec 15, 2009 at 6:41 PM, Bob Friesenhahn <bfrie...@simple.dallas.tx.us> wrote:
On Tue, 15 Dec 2009, Bill Sprouse wrote:
Hi Everyone,
I hope this is the right forum for this question. A customer is
using a Thumper as an NFS file server to provide the mail store for
multiple email servers (Dovecot). They find that when a zpool is
freshly created and [...]
It seems that Dovecot's speed optimizations for the mbox format
(http://wiki.dovecot.org/MailboxFormat/mbox#Dovecot.27s_Speed_Optimizations)
are specially designed to break zfs, which explains why using a tiny 8k
recordsize temporarily "improved" performance. Tiny updates seem to be
abnormal for a mail server.
The many tiny updates combined with zfs COW conspire to spread the
data around the disk, requiring a seek for each 8k of data. If more
data were written at once, using much larger blocks, then the
filesystem would continue to perform well over time, although perhaps
less well initially. If the system has sufficient RAM, or a large
enough L2ARC, then Dovecot's optimizations to diminish reads become
meaningless.
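For reference, the block size being discussed is the per-dataset
"recordsize" property; a sketch, with "tank/mail" standing in for the
actual mail dataset:

  zfs get recordsize tank/mail        # 128K is the default unless it was lowered
  zfs set recordsize=128k tank/mail   # applies to newly written files only; existing files keep their block size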
Is this expected behavior given the application (email - small,
random writes/reads)? Are there recommendations for system/ZFS/NFS
configurations to improve this sort of thing? Are there best
practices for structuring backups to avoid a directory walk?
Zfs works best when whole files are re-written rather than updated
in place as Dovecot seems to want to do. Either the user mailboxes
should be re-written entirely when they are "expunged" or else a
different mail storage format which writes entire files, or much
larger records, should be used.
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss