> On Feb 22, 2017, at 2:44 PM, Timo Sirainen <t...@iki.fi> wrote:
> 
> I guess mainly the message sequence numbers in IMAP protocol makes this more 
> difficult, but it's not an impossible problem to solve.

Any thoughts on the wisdom of supporting an external database for session state 
or even mailbox state (like using Redis or even MySQL)?

Also, would it help reliability or scalability to store a copy of the index 
data in an external database?

I want to use the mdbox format, but I have heard that its index files do get 
corrupted occasionally and have to be rebuilt (possibly using an older version 
of the index file to construct a new one). I worry that using mdbox might cause 
my users to see their IMAP flags suddenly reset to a previous state (for example, 
previously read messages becoming unread in their mail clients).

If a copy of the index data were stored in an external database, problems such 
as duplicate messages appearing in a Dovecot cluster could be handled by having 
the cluster look up the index data in the external database instead of relying 
on the local copy stored on each server. An external database could easily hand 
out unique serial numbers cluster-wide. On the site I’m building, I even use 
Redis to implement “message queues” between Postfix and Dovecot (via the Redis 
list push/pop commands). Currently, I deliver new messages only via IMAP 
rather than LMTP (no LMTP will be available to my backend mail servers, only 
IMAP).
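
To make the "unique serial numbers cluster-wide" idea concrete, here is a 
minimal sketch of a per-mailbox counter kept in a shared database. It is only 
an illustration under assumptions: sqlite3 stands in for the external database 
(MySQL or Redis in a real cluster), and the table name mailbox_uid and the 
helpers open_db and allocate_uid are hypothetical names of mine, not anything 
Dovecot provides.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the stand-in database and create the (hypothetical) counter table."""
    # isolation_level=None puts the connection in autocommit mode so that
    # the explicit BEGIN/COMMIT statements below control the transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mailbox_uid ("
        " mailbox TEXT PRIMARY KEY,"
        " next_uid INTEGER NOT NULL)"
    )
    return conn


def allocate_uid(conn, mailbox):
    """Return the next unused serial number for a mailbox.

    The read and the increment happen inside one write transaction, so two
    callers sharing the same database cannot be handed the same number.
    """
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        # Create the counter row on first use; no-op if it already exists.
        conn.execute(
            "INSERT OR IGNORE INTO mailbox_uid (mailbox, next_uid) VALUES (?, 1)",
            (mailbox,),
        )
        (uid,) = conn.execute(
            "SELECT next_uid FROM mailbox_uid WHERE mailbox = ?", (mailbox,)
        ).fetchone()
        conn.execute(
            "UPDATE mailbox_uid SET next_uid = next_uid + 1 WHERE mailbox = ?",
            (mailbox,),
        )
        conn.execute("COMMIT")
        return uid
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Every server asking the same database for allocate_uid(conn, "kevin/INBOX") 
would get a strictly increasing number for that mailbox, which is the property 
I am after. With Redis, the same effect could be had from its atomic INCR 
command on a per-mailbox key.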

If you stored the MD5 checksum of the index files (and even the message files) 
in the external database, you could also run a background process that would 
periodically check for corruption of the local index files using the checksums 
from the database, making mdbox format even more bulletproof.
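
As a rough sketch of that background check: the code below recomputes the MD5 
of each local file and compares it with the recorded value. The dict passed to 
find_corrupted stands in for whatever the external database would return, and 
both function names are hypothetical. It also assumes the recorded checksum is 
refreshed whenever a file changes legitimately, which for index files would be 
often.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(expected):
    """Return the sorted paths whose file is missing or has a different MD5.

    `expected` maps a file path to the checksum recorded for it; here it is
    a plain dict standing in for rows fetched from the external database.
    """
    bad = []
    for path, want in expected.items():
        p = Path(path)
        if not p.is_file() or md5_of_file(p) != want:
            bad.append(str(path))
    return sorted(bad)
```

A periodic job could call find_corrupted with the recorded checksums for one 
user's mail storage and trigger a rebuild from the database for anything it 
reports.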

And, the best thing about using an external database is that making the 
external database highly available is not a problem (as most sites already do 
that). The index data stored in the database would become the “source of truth” 
with the local index files/session data being an efficient cache for the 
mailstore. And, re-caching could occur as needed to make the whole cluster more 
reliable.

Kevin
