Hi,

Bringing this up again to ask one more question: what is the recommended locking strategy for dovecot on cephfs? This is a load-balanced setup using independent director instances, but the dovecot instances on all nodes share the same storage system (cephfs).
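For reference, a minimal sketch of the lock-related Dovecot settings usually discussed for shared-filesystem deployments, written here on the assumption that CephFS provides coherent fcntl/POSIX locking and metadata across clients; the values are illustrative, not settings validated on CephFS in this thread:

  # dovecot.conf -- lock-related settings when mail data lives on a shared filesystem
  # (sketch only; assumes CephFS gives all nodes coherent fcntl locks and mtimes)
  mmap_disable = yes        # don't mmap() index files that another node may rewrite
  mail_fsync = always       # fsync after writes so other nodes see consistent data
  lock_method = fcntl       # fcntl byte-range locks; fall back to dotlock only if
                            #   fcntl proves unreliable on the shared mount
  dotlock_use_excl = yes    # assumes O_EXCL is reliable on CephFS (unlike old NFSv2)
  mail_nfs_storage = no     # NFS-specific cache-flush workarounds; presumably not
  mail_nfs_index = no       #   needed for a CephFS mount

With the director consistently routing each user to the same backend, as described later in the quoted thread, cross-node contention on a given mailbox's locks and index files should be the exception rather than the rule.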
Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


On Wed, May 16, 2018 at 5:15 PM Webert de Souza Lima <webert.b...@gmail.com> wrote:

> Thanks Jack.
>
> That's good to know. It is definitely something to consider.
> In a distributed storage scenario we might build a dedicated pool for that
> and tune the pool as more capacity or performance is needed.
>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> *IRC NICK - WebertRLZ*
>
>
> On Wed, May 16, 2018 at 4:45 PM Jack <c...@jack.fr.eu.org> wrote:
>
>> On 05/16/2018 09:35 PM, Webert de Souza Lima wrote:
>> > We'll soon do benchmarks of sdbox vs mdbox over cephfs with the
>> > bluestore backend.
>> > We'll have to do some work on how to simulate user traffic, for writes
>> > and reads. That seems troublesome.
>>
>> I would appreciate seeing these results!
>>
>> > Thanks for the plugin recommendations. I'll take the chance to ask you:
>> > how is the SIS status? We have used it in the past and we had some
>> > problems with it.
>>
>> I have been using it since Dec 2016 with mdbox, with no issues at all (I
>> am currently using Dovecot 2.2.27-3 from Debian Stretch).
>> The only setting I changed is mail_attachment_dir; the rest is left at
>> the defaults (mail_attachment_min_size = 128k, mail_attachment_fs = sis
>> posix, mail_attachment_hash = %{sha1}).
>> The backend storage is a local filesystem, and there is only one Dovecot
>> instance.
>>
>> > Regards,
>> >
>> > Webert Lima
>> > DevOps Engineer at MAV Tecnologia
>> > *Belo Horizonte - Brasil*
>> > *IRC NICK - WebertRLZ*
>> >
>> >
>> > On Wed, May 16, 2018 at 4:19 PM Jack <c...@jack.fr.eu.org> wrote:
>> >
>> >> Hi,
>> >>
>> >> Many (most?) filesystems do not store multiple files in the same
>> >> block.
>> >>
>> >> Thus, with sdbox, every single mail (you know, the kind of mail with
>> >> 10 lines in it) will eat an inode and a block (4k here).
>> >> mdbox is more compact in this regard.
>> >>
>> >> Another difference: sdbox removes the message file on deletion, mdbox
>> >> does not: a single metadata update is performed, which may be packed
>> >> with others if many messages are deleted at once.
>> >>
>> >> That said, I have no experience with dovecot + cephfs, nor have I run
>> >> tests of sdbox vs mdbox.
>> >>
>> >> However, and this is a bit off topic, I recommend you look at the
>> >> following dovecot features (if you have not already), as they are
>> >> awesome and will help you a lot:
>> >> - Compression (classic, https://wiki.dovecot.org/Plugins/Zlib)
>> >> - Single-Instance-Storage (aka SIS, aka "attachment deduplication":
>> >>   https://www.dovecot.org/list/dovecot/2013-December/094276.html)
>> >>
>> >> Regards,
>> >>
>> >> On 05/16/2018 08:37 PM, Webert de Souza Lima wrote:
>> >>> I'm sending this message to both the dovecot and ceph-users MLs, so
>> >>> please don't mind if something seems too obvious to you.
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have a question for both the dovecot and ceph lists; below I'll
>> >>> explain what's going on.
>> >>>
>> >>> Regarding the dbox format
>> >>> (https://wiki2.dovecot.org/MailboxFormat/dbox): when using sdbox, a
>> >>> new file is stored for each email message.
>> >>> When using mdbox, multiple messages are appended to a single file
>> >>> until it reaches/passes the rotate limit.
>> >>>
>> >>> I would like to understand better how the mdbox format impacts IO
>> >>> performance.
>> >>> I think it's generally expected that fewer, larger files translate
>> >>> to less IO and more throughput when compared to many small files,
>> >>> but how does dovecot handle that with mdbox?
>> >>> If dovecot flushes data to storage every time a new email arrives
>> >>> and is appended to the corresponding file, would that mean it
>> >>> generates the same amount of IO as it would with one file per
>> >>> message?
>> >>> Also, when using mdbox, many messages will be appended to a given
>> >>> file before a new file is created. That should mean that a file
>> >>> descriptor is kept open for some time by the dovecot process.
>> >>> Using cephfs as the backend, how would this impact cluster
>> >>> performance regarding MDS caps and cached inodes when files from
>> >>> thousands of users are opened and appended to all over?
>> >>>
>> >>> I would like to understand this better.
>> >>>
>> >>> Why?
>> >>> We are a small business email hosting provider with bare-metal,
>> >>> self-hosted systems, using dovecot to serve mailboxes and cephfs for
>> >>> email storage.
>> >>>
>> >>> We are currently working on a dovecot and storage redesign to be in
>> >>> production ASAP. The main objective is to serve more users with
>> >>> better performance, high availability and scalability.
>> >>> * high availability and load balancing are extremely important to us *
>> >>>
>> >>> In our current model, we're using the mdbox format with dovecot,
>> >>> with dovecot's INDEXes stored in a replicated pool of SSDs, and
>> >>> messages stored in a replicated pool of HDDs (under a Cache Tier
>> >>> with a pool of SSDs). All using cephfs with the filestore backend.
>> >>>
>> >>> Currently there are 3 clusters running dovecot 2.2.34 and ceph Jewel
>> >>> (10.2.9-4):
>> >>> - ~25K users from a few thousand domains per cluster
>> >>> - ~25TB of email data per cluster
>> >>> - ~70GB of dovecot INDEX [meta]data per cluster
>> >>> - ~100MB of cephfs metadata per cluster
>> >>>
>> >>> Our goal is to build a single ceph cluster for storage that can
>> >>> expand in capacity, be highly available and perform well enough. I
>> >>> know, that's what everyone wants.
>> >>>
>> >>> Cephfs is an important choice because:
>> >>> - there can be multiple mountpoints, thus multiple dovecot instances
>> >>>   on different hosts
>> >>> - the same storage backend is used for all dovecot instances
>> >>> - no need to shard domains
>> >>> - dovecot is easily load balanced (with the director sticking users
>> >>>   to the same dovecot backend)
>> >>>
>> >>> In the upcoming upgrade we intend to:
>> >>> - upgrade ceph to 12.X (Luminous)
>> >>> - drop the SSD Cache Tier (because it's deprecated)
>> >>> - use the bluestore engine
>> >>>
>> >>> I was told on freenode/#dovecot that there are many cases where
>> >>> sdbox performs better over NFS sharing.
>> >>> In the case of cephfs, at first I wouldn't think that would be true,
>> >>> because more files == more generated IO, but considering what I said
>> >>> at the beginning regarding sdbox vs mdbox, that could be wrong.
>> >>>
>> >>> Any thoughts will be highly appreciated.
>> >>>
>> >>> Regards,
>> >>>
>> >>> Webert Lima
>> >>> DevOps Engineer at MAV Tecnologia
>> >>> *Belo Horizonte - Brasil*
>> >>> *IRC NICK - WebertRLZ*
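For reference, a minimal configuration sketch pulling together the pieces discussed in the quoted thread: mdbox format, indexes on a separate SSD-backed path, the zlib compression plugin, and SIS attachment deduplication. The paths, the %d/%n layout, and the rotate/compression values are illustrative assumptions, not values taken from the thread:

  # mdbox on CephFS, with indexes kept on an SSD-backed mount (paths are examples)
  mail_location = mdbox:/srv/cephfs/mail/%d/%n:INDEX=/srv/ssd/indexes/%d/%n

  # mdbox rotation: append messages to the current m.* file until a limit is hit
  mdbox_rotate_size = 10M
  mdbox_rotate_interval = 0     # 0 = rotate on size only, no time-based rotation

  # zlib plugin: transparent compression of newly saved mails
  mail_plugins = $mail_plugins zlib
  plugin {
    zlib_save = gz              # gzip-compress on save
    zlib_save_level = 6         # 1..9, speed vs. ratio trade-off
  }

  # Single-Instance-Storage of attachments (the settings Jack quotes above)
  mail_attachment_dir = /srv/cephfs/attachments
  mail_attachment_min_size = 128k
  mail_attachment_fs = sis posix
  mail_attachment_hash = %{sha1}

Keeping INDEX= on the SSD-backed path keeps the hot index I/O off the HDD-backed data pool, which matches the index/message split already described in the thread.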