On Sun, Apr 06, 2003, Ira Abramov wrote about "Re: In what ways maildir is probably better then mbox?":
> > For deletion your point is obviously true, but I don't think it's true for
> > delivery: delivery in an old-style mailbox file involves simply appending
> > a message to the end of the file - there is no need to search the big
> > mailbox for anything while doing it.
>
> other than seeking to the very end of it, locking it, checking for
> previous locks, checking for stale status of said locks, etc. etc.

Everything you describe is an O(1) operation, not something that takes
longer as the mailbox grows. These small O(1) operations are not noticeable
in any situation except perhaps when you are serving hundreds of thousands
of mailboxes.
By the way, maildir implicitly includes the same locks and everything you
describe, it's just that they are inside the kernel (so that two processes
writing to the same directory concurrently don't ruin it).

> think of maildir as a hash and mbox as a linked list. it's even worse

As I said, it is only the *append* operation which is fast on mbox - other
operations, like showing the list of messages or deleting a message in the
middle of the file, do require reading or writing the whole file, which is
very inefficient for large files.

> since you don't know how large each message is when you traverse the
> mbox so you need to read each and every byte into memory, think about
> cache thrashing and trashing with modern mailboxes full of large
> attachments.

This is not entirely true, because many mail messages contain a
Content-Length: header, which lets a reader skip over large emails and
attachments easily. The more problematic case is that of an mbox containing
thousands of small messages.

By the way, even with the mailbox file format, a smart mail browser could
improve the "perceived response time" by, say, first listing the messages
in the last 500K of the file (presumably the newest and most interesting to
the user), and then going back to list the rest - just like web browsers
nowadays display web pages incrementally, as they are being loaded.

> and a word about "indexed mailboxes", Mozilla doesn't change a thing in
> the mbox format, it just adds a persistent index file that saves the
> time of indexing an mbox in RAM each time you open it. If you dump the
> index you should still be able to read it in mutt for instance.
> some past versions of Netscape knew how to reindex standalone mbox files
> when forced to adopt one, I have no idea if that option still exists.

Another possible optimization (I don't know whether Mozilla does this) is
to avoid rewriting the mbox every time you delete a few messages - it is
enough to mark the messages as deleted, and do the "gap closing" and
reindexing later - perhaps even as a cron job, when the user isn't waiting
on this folder.

-- 
Nadav Har'El                        | Sunday, Apr 6 2003, 4 Nisan 5763
[EMAIL PROTECTED]                   |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |This box was intentionally left blank.
http://nadav.harel.org.il           |
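
P.S. To make the "mark now, close the gaps later" idea concrete, here is a
rough Python sketch. It is only an illustration under several assumptions,
not how Mozilla or any other mail client actually works: the function names
and the sidecar file holding deleted message numbers are invented for this
example, no locking is done, and "From " lines inside message bodies are
assumed to be escaped already.

```python
import os
import tempfile


def scan_offsets(path):
    """Return (start, end) byte offsets of each message in an mbox file.

    Assumes every message starts with a line beginning with "From " and
    that body lines starting with "From " have already been escaped.
    """
    starts = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            if line.startswith(b"From "):
                starts.append(pos)
            pos += len(line)
    ends = starts[1:] + [pos]
    return list(zip(starts, ends))


def mark_deleted(sidecar, msg_number):
    """Cheap delete: record the message number in a sidecar file.

    The mbox itself is not touched, so the cost does not depend on its size.
    """
    with open(sidecar, "a") as f:
        f.write(f"{msg_number}\n")


def read_deleted(sidecar):
    """Return the set of message numbers marked as deleted."""
    if not os.path.exists(sidecar):
        return set()
    with open(sidecar) as f:
        return {int(line) for line in f if line.strip()}


def compact(path, sidecar):
    """Expensive pass: rewrite the mbox without the deleted messages.

    Meant to run when nobody is waiting on the folder (e.g. from cron).
    A real implementation would have to lock the mailbox first.
    """
    deleted = read_deleted(sidecar)
    offsets = scan_offsets(path)
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=folder)
    with os.fdopen(fd, "wb") as out, open(path, "rb") as src:
        for number, (start, end) in enumerate(offsets):
            if number in deleted:
                continue
            src.seek(start)
            out.write(src.read(end - start))
    os.replace(tmp, path)  # swap the compacted file into place
    if os.path.exists(sidecar):
        os.remove(sidecar)
```

Message numbers here are positions in the file, so they are only valid
until the next compact(); a mail reader using this scheme would also have
to hide the marked messages from its listing in the meantime.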