I was wondering whether I should add compression support to mdbox one mail at a time or one file (~2 MB) at a time. The tradeoffs are:
* one mail at a time allows quickly seeking to the wanted mail inside the file, but it can't compress mails as well
* one file at a time compresses better, but seeking is slow because it can only be done by uncompressing all the data until the wanted offset is reached

I did a quick test for this with 27 MB of my old INBOX mails (note the -b option, so it doesn't count wasted fs space):

    mdbox/storage% du -sb .
    15120350        .
    Maildir/cur% du -sb .
    16517320        .
    % echo 1-15120350/16517320|bc -l
    .08457606924125705623

So, compressed mdboxes take 8.5% less space. This was with regular gzip compression at the default level. With bzip2 -9 compression the difference was 10%.

Any thoughts on whether 8-10% is a significant enough improvement to justify worse seeking performance? Or perhaps I should just implement both ways.. :)
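
To make the two layouts concrete, here is a rough Python sketch using zlib. It is only an illustration, not Dovecot code: the function names and the in-memory index are made up for the example, and a real mdbox file has its own headers and metadata. Per-mail compression can jump straight to a compressed offset, while per-file compression has to inflate the stream from the start until the wanted offset is reached.

```python
import zlib


def compress_per_mail(mails, level=6):
    """Compress each mail separately.

    Returns (blob, index), where index[n] is the (offset, length)
    of mail n's compressed bytes inside blob.
    """
    blob = bytearray()
    index = []
    for mail in mails:
        packed = zlib.compress(mail, level)
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_per_mail(blob, index, n):
    """Seek directly to mail n and inflate only that mail."""
    offset, length = index[n]
    return zlib.decompress(blob[offset:offset + length])


def compress_per_file(mails, level=6):
    """Compress all mails as one stream.

    Returns (blob, index), where index[n] is the (offset, length)
    of mail n in the *uncompressed* data.
    """
    index = []
    pos = 0
    for mail in mails:
        index.append((pos, len(mail)))
        pos += len(mail)
    return zlib.compress(b"".join(mails), level), index


def read_per_file(blob, index, n, chunk=4096):
    """Inflate from the start of the stream until mail n is covered."""
    offset, length = index[n]
    inflater = zlib.decompressobj()
    out = bytearray()
    for i in range(0, len(blob), chunk):
        out += inflater.decompress(blob[i:i + chunk])
        if len(out) >= offset + length:
            break  # everything after the wanted mail stays compressed
    else:
        out += inflater.flush()
    return bytes(out[offset:offset + length])
```

Reading the last mail in a per-file blob costs about as much as inflating the whole file, which is the seeking penalty described above. The size difference between the two blobs depends on how much the mails share (headers, quoted text), so the 8-10% figure should not be expected to carry over to other mailboxes.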