On 11/11/2012 5:26 PM, Christoph Anton Mitterer wrote:
> Have you made systematic tests? I.e. compared times for all of these
> with those from the different dovecot backends.

The choice of Dovecot backends made no substantial difference.  I used maildir, 
sdbox, and mdbox.  I also added SiS (with mdbox).  Initial tests were on local 
multi-spindle RAID5 storage, but to handicap Dovecot, I pushed it over NFS 
(also Linux 3.2 on a local GigE segment).  It wasn't slow enough to make dbmail 
competitive, even though you have to start turning off performance optimisation 
features in Dovecot to avoid NFS bugs.

>> There wasn't a task that the dbmail setup performed faster than
>> Dovecot, in either low or high load situations.
> Which backend did you use?

Backend for dbmail?  Two MySQL versions (5.0 and 5.5); InnoDB is required for 
dbmail, by the way.  Also Postgres 8.4 and 9.1, using the default storage 
engine.  I ran the tests both with a separate DB machine and with a cohosted 
one, with the dbmail connector using local sockets instead of TCP/IP, but that 
didn't significantly alter the performance.

I've found my first notes from the tests.  It was the second round of tests 
with the latest MySQL 5.0 server given some tuning to more aggressively use 
system memory.  You will note the puny size of the mail folder hive in this 
round.

> The mysqld process has consumed nearly an hour of CPU time during this 
> process.
> dbmail is configured to use local sockets rather than network I/O.
> 
> I'm using the Perl MailTools http://search.cpan.org/dist/MailTools/
> to import about 10 folders' worth of email, totaling about 560MB in raw size, 
> constituting about 23,000 emails.  The script basically creates the folders, 
> and does an APPEND for each email.  It's bog simple.
> 
> I DROP the database, recreate it, add the one user, verify that DBMail 
> accepts authentication for the newly created mailbox, and then do the import.
> The MySQL files live on a freshly formatted ext4 filesystem.
> 
> The import takes Dovecot (MailDir or mdbox format), or Panda IMAP (mix) 
> about six minutes to complete.
> 
> DBMail 3 took 4h 23m.  Casual inspection of the system showed modestly 
> high CPU usage in mysqld and dbmail-imapd (as well as the import perl 
> command on occasion), but the Load Average didn't get too close to 1.0,
> let alone 2.0, which makes me suspect that I hit some kind of 
> "busy wait" pathology.
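
The import loop described in those notes (create the folders, then one APPEND 
per message) can be sketched roughly as below.  This is a Python illustration 
using the standard imaplib and mailbox modules, not the original Perl script.  
The one-mbox-file-per-folder source layout, the function names, and the host, 
user, and password arguments are all assumptions made for the example.

```python
import imaplib
import mailbox
import time
from pathlib import Path


def import_folders(conn, folders):
    """Create each folder, then APPEND every message to it.

    conn    -- object with imaplib.IMAP4-style create() and append() methods
    folders -- mapping of folder name -> list of raw RFC 822 messages (bytes)
    Returns the number of messages appended.
    """
    appended = 0
    for name, messages in folders.items():
        # The result of CREATE is deliberately not checked, so an
        # already-existing folder does not abort the run.
        conn.create(name)
        for raw in messages:
            # flags=None and date_time=None let the server pick defaults.
            typ, _ = conn.append(name, None, None, raw)
            if typ != "OK":
                raise RuntimeError(f"APPEND to {name} failed")
            appended += 1
    return appended


def load_mbox_dir(directory):
    """Assumed source layout: one '<folder>.mbox' file per folder."""
    return {
        path.stem: [msg.as_bytes() for msg in mailbox.mbox(str(path))]
        for path in sorted(Path(directory).glob("*.mbox"))
    }


def timed_import(host, user, password, directory):
    """Run the whole import against one server; return (count, seconds)."""
    conn = imaplib.IMAP4(host)
    conn.login(user, password)
    started = time.monotonic()
    count = import_folders(conn, load_mbox_dir(directory))
    elapsed = time.monotonic() - started
    conn.logout()
    return count, elapsed
```

Calling timed_import() with the same mail directory against each server under 
test gives one wall-clock figure per back-end, which is all the comparison 
above relies on.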

To clarify the above:  To streamline iterative testing, I made a script to 
deactivate the currently running SQL server, unmount, re-format, re-mount, and 
re-populate the skeletal DB directories and restart the DB engine.  So between 
each test, no matter the imapd or DB back-end, the mailstore was presented with 
a freshly formatted volume on dedicated spindles.  The filesystem was ext4, 
formatted with:

lazy_itable_init=0,lazy_journal_init=0,dir_index=1,extents=1,uninit_bg=0,flex_bg=0,has_journal=0,inode_size=256,dir_index=1,
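
The reset cycle described above can be sketched as an ordered command list.  
This is an illustration, not the script that was actually used: the device, 
mount point, service name, and the command that repopulates the skeletal DB 
directories are placeholders supplied by the caller, and the mkfs.ext4 flags 
are one reading of the option list above expressed as mke2fs -O/-E/-I 
arguments.

```python
import shlex
import subprocess


def reset_plan(device, mountpoint, service, populate_cmd):
    """Return the commands, in order, that rebuild the DB volume from scratch.

    populate_cmd is the engine-specific command (as an argument list) that
    recreates the empty database directories; it is caller-supplied.
    """
    # Assumed translation of the option list into mke2fs feature flags.
    features = "^has_journal,^flex_bg,^uninit_bg,dir_index,extent"
    extended = "lazy_itable_init=0,lazy_journal_init=0"
    return [
        ["service", service, "stop"],
        ["umount", mountpoint],
        ["mkfs.ext4", "-q", "-O", features, "-E", extended, "-I", "256", device],
        ["mount", device, mountpoint],
        list(populate_cmd),
        ["service", service, "start"],
    ]


def run(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    rendered = [shlex.join(cmd) for cmd in plan]
    for cmd, text in zip(plan, rendered):
        print(text)
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure
    return rendered
```

With dry_run left at its default the plan is only printed, which makes it easy 
to check the device name before anything gets reformatted.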

> Do you have detailed numbers?

Not really, but after it was clear that I wasn't going to get comparable 
performance even within the same order of magnitude, I stopped testing it.  I included 
the IMAP SEARCH performance comparison against fts_squat in my original mail to 
this list.  In addition to huge performance deficiencies, it also has/had fatal 
operational bugs.
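
Taking the one set of figures from the notes above at face value (about 23,000 
messages, about 560MB, about six minutes versus 4h 23m), the gap can be put in 
rough numbers.  This is back-of-the-envelope arithmetic on approximate inputs, 
not a new measurement:

```python
# Approximate figures from the import notes quoted earlier.
MESSAGES = 23_000
RAW_MB = 560

dovecot_s = 6 * 60                # about six minutes
dbmail_s = 4 * 3600 + 23 * 60     # 4h 23m


def rates(seconds):
    """Return (messages per second, MB per second) for one import run."""
    return MESSAGES / seconds, RAW_MB / seconds


slowdown = dbmail_s / dovecot_s
for label, seconds in (("Dovecot", dovecot_s), ("DBMail 3", dbmail_s)):
    msgs, mb = rates(seconds)
    print(f"{label:9s} {msgs:6.2f} msg/s  {mb:6.3f} MB/s")
print(f"DBMail 3 took about {slowdown:.0f}x as long")
```

That works out to a factor of roughly 44 on this import, i.e., more than one 
order of magnitude.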

> I guess you’ve "only" tried dbmail?

I did try Manitou, but the lack of a proper IMAP service for it made extensive 
"like for like" testing very difficult.  Manitou is still in its very early 
days, alas.  It also relies on the SQL DB's underlying authentication system, 
which is rather ... alarming.  It performs quite a bit better than dbmail, but 
it's still not close to Dovecot.  At the time I tested it, only custom-rolled 
clients could talk to it, i.e., there were no imap4/pop3 "gateways" to it.

I think I was most alarmed to see that the widely assumed benefits of putting 
mail on a SQL DB, i.e., fast searching/sorting, didn't actually happen in 
reality.

As others have mentioned, I also shudder to think of backup/restore issues, 
especially on a single user level.  Not only are the mechanisms for backing up 
and restoring maildirs and even mdboxes, i.e., simple files, well understood, 
but the failure modes are generally fully recoverable.  SQL-DB file blobs, 
especially with MySQL, remind me too much of the "PST Hell" that Exchange 
administrators face.  But maybe that's just my ignorance talking.

> All something I wouldn’t want to do on my production systems ;)

Neither would I.  But as I said, I was "desperate" to get this close to 
Dovecot's performance.  I had about 2-3 weeks to pre-qualify mail storage 
back-ends with an eye towards a 4- or 5-digit user count and tens to hundreds 
of TB of mail storage.  Running across such poor performance with such 
relatively small loads disqualified the DB-based mail products very quickly, 
for ME, anyway.

If you want to run your own tests, my suggestion is to start with Postgres, put 
as much RAM into your DB machine as you can afford, and maybe populate your DB 
machine exclusively with SSDs.

=R=
