I was wondering whether I should add compression support to mdbox one mail
at a time or one file (~2 MB) at a time. The tradeoffs are:

 * one mail at a time allows quickly seeking to the wanted mail inside
the file, but it can't compress the mails as well
 * one file at a time compresses better, but seeking is slow because it
can only be done by uncompressing all the data until the wanted offset
is reached
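
To make the tradeoff concrete, here is a rough Python sketch (not
Dovecot code). The sample mails and the helper names per_mail_size,
per_file_size and read_from_whole_file are made up for illustration,
and it assumes plain gzip at the default level:

```python
import gzip
import io

def per_mail_size(mails):
    """Total bytes when each mail is gzipped on its own."""
    return sum(len(gzip.compress(m)) for m in mails)

def per_file_size(mails):
    """Total bytes when all mails are gzipped as one stream."""
    return len(gzip.compress(b"".join(mails)))

def read_from_whole_file(blob, offset, length):
    """Fetch one mail from a whole-file stream. Everything before
    offset has to be decompressed and thrown away first."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob)) as f:
        f.read(offset)  # wasted work, grows with the offset
        return f.read(length)

# Hypothetical sample data: 200 small mails with similar headers.
mails = [
    (f"From: user{i}@example.com\r\nTo: me@example.com\r\n"
     f"Subject: test {i}\r\n\r\nbody of message {i}\r\n").encode()
    for i in range(200)
]

blob = gzip.compress(b"".join(mails))
offset = sum(len(m) for m in mails[:150])
wanted = read_from_whole_file(blob, offset, len(mails[150]))

print("per mail:", per_mail_size(mails), "bytes")
print("per file:", per_file_size(mails), "bytes")
```

With per-mail compression the wanted mail could be uncompressed
directly from its stored offset, at the cost of a separate gzip header
and a fresh dictionary for every mail. Tiny synthetic mails like these
exaggerate the size difference compared to real mail.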

I did a quick test for this with 27 MB of my old INBOX mails:

(note the -b option, so it doesn't count wasted fs space)
mdbox/storage% du -sb .
15120350        .

Maildir/cur% du -sb .
16517320        .

% echo 1-15120350/16517320|bc -l
.08457606924125705623

So the compressed mdboxes take about 8.5% less space than the Maildir.
This was with regular gzip compression at the default level. With
bzip2 -9 compression the difference was 10%.

Any thoughts on whether an 8-10% saving is a significant enough
improvement to justify worse seeking performance? Or perhaps I should
just implement both ways. :)
