On Sep 24, 2008, at 10:03 PM, Allen Belletti wrote:

As best I can determine, the worst problems occur when certain users
with very large Inboxes (~10k messages) receive new mail and their
client looks up information about that message.  GFS doesn't seem to
efficiently handle the large directories that contain folders like
this.  As a result, lots of I/O ops are generated and performance
suffers for everyone.

I am beginning to wonder if it might be more efficient to revert to the
old mbox format, with one file per folder (plus whatever indices are
created).  It seems that this ought to work better with GFS, which is
geared toward smaller numbers of larger files.  Is anyone on the list
currently doing that?  Alternatively, any thoughts regarding tuning or
other options would be appreciated.

One possibility would be to use the dbox format with hashed directories, so that each mailbox's messages are spread across n subdirectories instead of sitting in one large directory. There are two problems, though:

1. The dbox code hasn't been tested all that much in the real world yet (but it works well in my stress tests).

2. dbox doesn't yet support directory hashing, but it would be pretty easy to implement.
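
To illustrate what directory hashing means here, the following is a minimal sketch in Python. It is not Dovecot or dbox code: the function name, the "u.<id>" file naming, the two-digit hex bucket names, and the default of 16 buckets are all assumptions made for the example. The idea is only that a stable hash of the message identifier picks one of n subdirectories, so a 10k-message mailbox ends up as n smaller directories.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_message_path(mailbox_root, message_id, n_dirs=16):
    """Return a path for message_id under one of n_dirs subdirectories.

    Hypothetical layout for illustration only; not the dbox on-disk format.
    """
    if not 1 <= n_dirs <= 256:
        raise ValueError("n_dirs must be between 1 and 256")
    # A stable hash ensures the same message always maps to the same bucket.
    digest = hashlib.md5(str(message_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % n_dirs
    # Bucket directories are named 00, 01, ... in hex (assumed naming).
    return PurePosixPath(mailbox_root) / f"{bucket:02x}" / f"u.{message_id}"


def bucket_counts(message_ids, n_dirs=16):
    """Count how many messages land in each bucket directory."""
    counts = {}
    for message_id in message_ids:
        name = hashed_message_path("/", message_id, n_dirs).parent.name
        counts[name] = counts.get(name, 0) + 1
    return counts
```

Calling bucket_counts(range(10000), 16) shows how a 10k-message Inbox would be split across the 16 subdirectories. A lookup for a single message then only touches one of the smaller directories.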
