On 22 Sep 2017, at 14.18, mj <li...@merit.unu.edu> wrote:
> First, the Github link:
> https://github.com/ceph-dovecot/dovecot-ceph-plugin
> 
> I am not going to repeat everything which is on Github, but a short summary:
> 
> - CephFS is used for storing Mailbox Indexes
> - E-Mails are stored directly as RADOS objects
> - It's a Dovecot plugin
> 
> We would like everybody to test librmb and report back issues on Github so 
> that further development can be done.
> 
> It's not finalized yet, but all the help is welcome to make librmb the best 
> solution for storing your e-mails on Ceph with Dovecot.

It would have been nicer if RADOS support had been implemented as a lib-fs 
driver, and the fs-API had been used everywhere else. Then 1) 
LibRadosMailBox wouldn't rely so much on RADOS specifically and 2) 
fs-rados could be used for other purposes. There are already fs-dict and 
dict-fs drivers, so the RADOS dict driver might not have been necessary to 
implement if fs-rados had been implemented instead (although I didn't check it 
closely enough to verify). (We've also had fs-rados on our TODO list for a 
while.)
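
To illustrate the layering argument, here is a minimal sketch in Python. It is 
not Dovecot's actual lib-fs API (which is C); FsDriver, MemoryFs and FsDict are 
hypothetical names made up for this example. The point is only that a dict 
written against a generic object-store interface works with any driver, so a 
RADOS-backed driver could presumably be dropped in without a separate RADOS 
dict driver.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class FsDriver(ABC):
    """Hypothetical generic object-store interface (stand-in for an fs-API)."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def delete(self, path: str) -> None: ...


class MemoryFs(FsDriver):
    """In-memory driver used here so the sketch runs without a cluster.

    A RADOS-backed driver would implement the same three calls.
    """

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self._objects[path] = bytes(data)

    def read(self, path: str) -> bytes:
        return self._objects[path]  # raises KeyError if the object is missing

    def delete(self, path: str) -> None:
        self._objects.pop(path, None)


class FsDict:
    """Key-value dict layered on top of any FsDriver (the dict-fs idea)."""

    def __init__(self, fs: FsDriver, prefix: str = "dict/") -> None:
        self._fs = fs
        self._prefix = prefix

    def set(self, key: str, value: str) -> None:
        self._fs.write(self._prefix + key, value.encode("utf-8"))

    def lookup(self, key: str) -> Optional[str]:
        try:
            return self._fs.read(self._prefix + key).decode("utf-8")
        except KeyError:
            return None
```

FsDict never mentions the backend, so the only RADOS-specific code would live 
in one driver class.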

BTW, we've also been planning on open sourcing some of the obox pieces, mainly 
fs-drivers (e.g. fs-s3). Maybe the obox format too, but without the "metacache" 
piece. The current obox code is a bit too tightly coupled to the metacache, 
though, to make open sourcing it easy. (The metacache is about storing the 
Dovecot index files in object storage and efficiently caching them on the local 
filesystem; it isn't planned to be open sourced in the near future. That's 
pretty much the only difficult piece of the obox plugin, with Cassandra 
integration coming a close second. I wish there had been a better/easier 
geo-distributed key-value database to use - tombstones are annoyingly 
troublesome.)

And with the rmb-mailbox format, my main worries would be:
 * doesn't store index files (= message flags) - not necessarily a problem, as 
long as you don't want geo-replication
 * index corruption means rebuilding them, which means rescanning the list of 
mail files, which means rescanning the whole RADOS namespace, which practically 
means rescanning the RADOS pool. That is most likely a very slow 
operation, which you want to avoid unless it's absolutely necessary. You need 
to be very careful to avoid that happening, and in general to avoid losing 
mails in case of crashes or other bugs.
 * I think copying/moving mails physically copies the full data on disk
 * Each IMAP/POP3/LMTP/etc process connects to RADOS separately from the 
others - some connection pooling would likely help here
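
As a rough illustration of the connection pooling idea in the last point, here 
is a minimal sketch in Python. It assumes a single long-lived process that 
serves many sessions; sharing connections across separate OS processes would 
need something more, such as a proxy process. ConnectionPool and the connect 
callable are hypothetical names, and connect stands in for whatever actually 
opens a cluster connection.

```python
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """Reuses idle connections instead of opening one per session."""

    def __init__(self, connect, size=4):
        self._connect = connect              # hypothetical connection factory
        self._idle = queue.LifoQueue(maxsize=size)
        self._lock = threading.Lock()
        self.opened = 0                      # how many real connects happened

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get_nowait()   # reuse an idle connection
        except queue.Empty:
            conn = self._connect()           # none idle: open a new one
            with self._lock:
                self.opened += 1
        try:
            yield conn
        finally:
            try:
                self._idle.put_nowait(conn)  # hand it back for the next user
            except queue.Full:
                # Pool already holds `size` idle connections: drop this one.
                close = getattr(conn, "close", None)
                if close is not None:
                    close()
```

A caller would wrap each session's storage access in 
"with pool.connection() as conn:", so the expensive connect is paid only when 
no idle connection is available.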
