On 12/19/2014 06:17 AM, Patrick Goetz wrote:
Nic,

Thanks for that detailed explanation.  I still feel myself somewhat
stymied by either the documentation (or lack thereof) or perhaps an
unfortunate case of being somewhat feeble-minded.  Here are some follow
up comments/questions:


On 12/18/2014 9:59 AM, Nic Bernstein wrote:
I will say that the ability to quiesce the application without halting
it would be most desirable.  Most databases have supported this sort of
thing for ages, and it would be great if one could send a signal to
Cyrus to achieve the same result.
I wonder what would happen if you just stopped lmtp while making a
snapshot?  Would postfix choke on this and start kicking messages back
to the sender, or would they get queued for later delivery?
Alternatively, maybe lmtp could temporarily divert new messages to a
dummy spool so that postfix/sendmail wouldn't have to know anything
about this.  This might be the least painful way to implement quiescence
in cyrus.

But LMTP is only one method affecting the mail store; IMAP and sieve can change it as well. Granted, one can brute-force this by shutting down network ports and the like, but at that point why not just stop cyrus?

  > His initial suggestion -- stop cyrus, snapshot, restart cyrus -- is
  > reasonable, but we feel that the later suggestion -- stop cyrus, tar
  > up data, start cyrus -- is not.  It takes data offline for too long.
  > That's why the snapshot capability is necessary in any truly suitable
  > server.

I agree.  Here is a substitute proposal (and I'll come back to why I'm
pushing this point).  Serially

    1. rsync user mail files
    2. rsync configdirectory db files
    3. rsync user mail files again

That should get you reasonably close to what you get with snapshots.

No, this is not in the least close to a snapshot. Snapshots are instantaneous, or near to it. The time an rsync takes, even a catch-up, grows with the size of the mail store and the deltas between attempts. Also, rsync is not well suited to the file-per-message, directory-per-mailbox storage scheme of cyrus, as it results in a great many stat() calls, and this just adds to the time.

I don't understand why one wouldn't use snapshots. Every modern OS and distro includes filesystems or volume managers which support snapshotting, and several, such as Ubuntu, even recommend snapshot-capable partitioning schemes out of the box. It's just not that hard, and it's exactly the right way to handle this sort of staged backup.

 * Halt cyrus
 * snapshot critical filesystems
     o spool data (/var/spool/imap)
     o config data (/var/lib/imap or /var/imap)
     o metadata (i.e. /var/run/cyrus)
 * start cyrus
 * mount snapshot
 * rsync or otherwise backup from snapshot
 * unmount snapshot
 * (optionally) destroy snapshot
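
The procedure above could be scripted roughly as follows. This is only a sketch, assuming LVM, a systemd unit named cyrus-imapd, and a volume group vg0 with logical volumes named spool and config; all of those names, the mount point and the rsync destination are made up for illustration and will differ on a real system. It defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Sketch of the halt / snapshot / restart / back up cycle for Cyrus on LVM.
# Every name below is an assumption for illustration, not a known setup.
set -eu

DRY_RUN="${DRY_RUN:-1}"            # 1 = only print commands, 0 = execute them
VG="${VG:-vg0}"                    # hypothetical LVM volume group
LVS="${LVS:-spool config}"         # hypothetical LVs holding spool and config data
SNAP_SIZE="${SNAP_SIZE:-5G}"       # copy-on-write space reserved per snapshot
MNT="${MNT:-/mnt/cyrus-snap}"      # hypothetical mount point for the snapshots
DEST="${DEST:-backup.example.com:/backups/cyrus}"   # hypothetical rsync target
SERVICE="${SERVICE:-cyrus-imapd}"  # service name varies by distro

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_backup() {
    failed=0
    # Downtime window: only as long as it takes to create the snapshots.
    run systemctl stop "$SERVICE"
    for lv in $LVS; do
        run lvcreate --snapshot --size "$SNAP_SIZE" --name "${lv}-snap" "/dev/$VG/$lv" || failed=1
    done
    # Restart Cyrus even if a snapshot could not be created.
    run systemctl start "$SERVICE"
    [ "$failed" = "0" ] || return 1

    # Cyrus is running again; copy from the frozen snapshots at leisure.
    for lv in $LVS; do
        run mkdir -p "$MNT/$lv"
        run mount -o ro "/dev/$VG/${lv}-snap" "$MNT/$lv"
        run rsync -a --delete "$MNT/$lv/" "$DEST/$lv/"
        run umount "$MNT/$lv"
        run lvremove -f "/dev/$VG/${lv}-snap"
    done
}

snapshot_backup
```

Run it once as-is to review the printed plan, then with DRY_RUN=0 from a cron or at job. The mail store is offline only between the stop and start lines, however long the rsync takes afterwards.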

This is so easy to handle via a cron or at job. Why would one do this? If the answer is "legacy system," then fine, but legacies can be upgraded or replaced.

If you follow the prescribed cyrus directory structure, then this can be
simplified (Arch linux example):

    1. rsync -a --delete /var/imap/user [removable disk/other server]
    2. rsync -a --delete /var/imap   [removable disk/other server]

Once you've rsynced the mail files once, rsyncing them again a short
time later should be pretty fast.  There does need to be a backup
solution for people who only have one server, hence can't use
replication or imapsync to do backups.

There is: snapshots, or hosted mail services (like Fastmail :).

Lastly, as to the use of imapsync to achieve user, mailbox or server
replication,...

So your command line is much like Patrick's example, but with '--user1
<user> --authuser1 <proxyuser> --user2 <user>...'
Of course you must create a proxy user, and Cyrus supports this with the
'proxyservers' option in imapd.conf (man imapd.conf for details),
e.g.: 'proxyservers:    proxyuser'.
Here is the imapd.conf man page entry for proxyservers:

    proxyservers: <none>
      A list of users and groups that are allowed to proxy for other
      users, separated by spaces. Any user listed in this will be
      allowed to login for any other user: use with caution. In a
      standard murder this option should ONLY be set on backends.
      DO NOT SET on frontends or things won't work properly.

That capitalized "DO NOT SET on frontends" would seem to be cause for
concern, especially since I don't understand how this works.

Well then, get thee to a website or man page. :-)
    http://cyrusimap.web.cmu.edu/docs/cyrus-imapd/2.4.17/ag.php

No, seriously, this isn't an issue if you're not using a murder. A "frontend" is the part of a murder aggregation cluster which proxies for the backend servers which actually hold the mail store. A murder consists of one or more frontends, one or more backends and a single "mupdate" master, which controls the canonical copy of the mailboxes database. In a murder, if one wants to set the proxyservers option, one sets it only on the backend machines.

The proxyservers option is exactly the right way to do this.
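
With a proxy user in place, the per-user imapsync runs could be generated along these lines. This is a sketch under assumptions: the host names, the user names alice and bob, and the password file path are invented for illustration, and the proxy user must already be listed in proxyservers on both servers. The --authuser1/--authuser2 and --passfile1/--passfile2 options are standard imapsync options; the script only prints the commands.

```shell
#!/bin/sh
# Sketch: print one imapsync command per user, authenticating as a proxy user.
# Hosts, users and the password file path are hypothetical.
set -eu

HOST1="${HOST1:-old.example.com}"      # hypothetical source server
HOST2="${HOST2:-new.example.com}"      # hypothetical destination server
PROXYUSER="${PROXYUSER:-proxyuser}"    # user named in proxyservers on both ends
PASSFILE="${PASSFILE:-/etc/imapsync/proxy.pass}"  # hypothetical file, password on its first line

# Build the command line for a single mailbox owner.
imapsync_cmd() {
    user="$1"
    echo "imapsync --host1 $HOST1 --user1 $user --authuser1 $PROXYUSER --passfile1 $PASSFILE" \
         "--host2 $HOST2 --user2 $user --authuser2 $PROXYUSER --passfile2 $PASSFILE"
}

# Hypothetical user list; in practice this would come from your account DB.
for u in alice bob; do
    imapsync_cmd "$u"
done
```

Review the output, then pipe it to sh to actually run the transfers. Keeping the proxy password in a file rather than on the command line keeps it out of the process list.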

For people who are
   1. imapsync'ing between machines both behind a firewall
   2. using saslauthd with pam

I thought of this solution:  Temporarily block port 143 traffic on the
outward facing port of your firewall, and then add the line

    auth  sufficient  pam_permit.so

to the top of /etc/pam.d/imap files on both the sending and receiving
imap servers.  This should allow you to imapsync the mail stores for
every user without having to provide passwords.  Once you're done,
simply remove these lines from the PAM configuration files and unblock
the port on the firewall.  Yes, this will mean that users won't be able
to access their mail from outside the firewall while the imapsync is in
operation, and this is probably only workable for smaller organizations
where people are not concerned about their coworkers temporarily being
able to access their mail.  There could probably be a desktop policy to
handle this as well.

Ouch, that seems a lot harder to me than setting proxyservers.

However, you are 100% correct that replication would appear to be a far
less complex solution.  After reading through the available
documentation, it wasn't clear to me that it was possible to do
replication without setting up a murder, a complexity I was hoping to avoid.

So, here's the feeble-mindedness component:  I didn't completely follow
your explanation for setting up a replication server.  It would be
awesome to have a howto for doing this -- is anyone aware of anything
like this; i.e. howto set up a replication server outside the murder
context.

Then please take a look at the replication page on the Project Cyrus website:
http://cyrusimap.org/docs/cyrus-imapd/2.4.17/install-replication.php

Here's my earlier example with the murder components stripped out, and some commenting added:

Both servers (note last entry):

/etc/services

   lmtp         24/tcp
   imap2                143/tcp
   imap2                143/udp
   imaps                993/tcp
   imaps                993/udp
   sieve                4190/tcp
   *csync               2005/tcp*

Master server:

/etc/imapd.conf

   ...
   ##
   # These configuration parameters are for the master server
   # in a replication set

   # The list of userids with administrative rights
   admins: cyrus

   ##
   # Replication support
   # This is how the BACKEND for this host is defined
   sync_host: replica.example.com
   sync_authname: mailproxy
   sync_password: <password>
   sync_realm: <if required for your auth scheme>

   # Whether to compress the replication stream, important if using WAN links
   sync_compress: true

   # To enable "rolling" replication, set this to TRUE
   # This causes all data altering daemons, such as imapd, lmtpd, etc. to log
   # their actions for replication.
   sync_log: true

   # Minimum interval (in seconds) between replication runs in rolling
   # replication mode.
   sync_repeat_interval: 5

   # A file whose existence will cause the sync_client to stop at its next
   # opportunity
   sync_shutdown_file: /var/run/cyrus/sync_stop
   ...

/etc/cyrus.conf

   ...
   START {
        ...
        syncclient              cmd="/usr/lib/cyrus/bin/sync_client -r"
        ...

Replica server:

/etc/imapd.conf

   ...
   ##
   # These configuration parameters are for the replica server in a
   # replication cluster

   # The list of userids with administrative rights
   # For a replica, this must include the user with which the master
   # will authenticate
   admins: cyrus mailproxy

   ##
   # Unless you're using TLS between master and replica, add this
   force_sasl_client_mech: PLAIN
   master_mechs: PLAIN

/etc/cyrus.conf

   ...
   SERVICES {
        ...
        syncserver       cmd="/usr/lib/cyrus/bin/sync_server" listen="csync"
        ...

Here are some extra notes:

 * The webpage listed above on replication explains rolling replication
   (think "log shipping" from the DB world) as well as manual
   replication.  Check that out.
 * We find that it doesn't hurt to use both rolling and periodic
   replication, and have cron handle the latter
 * If the master stops listening for csync traffic, when halted for a
   snapshot, for example, then the sync_server process on the replica
   will die.  So, we use a nanny cronjob to make sure that one gets
   started if none are running.

Here are our crontabs for master and replica:

Master:

   ### Ensure replication is up to date
   30 5 * * * /usr/local/sbin/cyrus_user_sync.pl >/dev/null 2>&1
   ##
   ### Run quota check script
   30 6 * * * /usr/local/sbin/quota-report >/dev/null 2>&1
   ##
   ### Update mailbox annotations
   45 6 * * * /usr/local/sbin/set_cyrus_annotations.sh >/dev/null 2>&1
   ##
   ### Update quotas
   */5 * * * * /usr/local/sbin/cyrus_ldap_quota.pl >/dev/null 2>&1

Replica:

   ##
   # ensure that the sync_client keeps running.  Comment this out
   # following promotion from replica to master.
   @hourly      /usr/local/sbin/sync_nanny.sh >/dev/null
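
For anyone who wants to roll their own, a nanny check can be as small as the following. To be clear, this is not our sync_nanny.sh, just a minimal sketch of the idea: the process name and start command are parameters, and the defaults shown (sync_client, started in rolling mode from the path used in the cyrus.conf example) are assumptions you should adjust. It only reports what it would do unless DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of a nanny check: start a daemon only if no instance is running.
# Defaults are assumptions; run this as the cyrus user from cron.
set -eu

DRY_RUN="${DRY_RUN:-1}"        # 1 = only report, 0 = actually start the process
PROC="${PROC:-sync_client}"    # process name to look for
START_CMD="${START_CMD:-/usr/lib/cyrus/bin/sync_client -r}"   # assumed start command

# nanny <process-name> <start-command>
nanny() {
    # pgrep -x matches the process name exactly and exits non-zero
    # when nothing matches.
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "running: $1"
        return 0
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "would start: $2"
    else
        # Intentionally unquoted so the command string splits into words.
        $2 &
    fi
}

nanny "$PROC" "$START_CMD"
```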

We'll be happy to share these scripts with anyone who'd care to have a copy, but they might be specific to our use of LDAP to manage account details. The idea of each, however, is to leverage the account DB, which in our case is almost always LDAP, to maintain, update or alter the cyrus account information.

However, I must be honest and point out that if you're going to go to
the trouble of figuring out how to use imapsync (and possibly pay for
it, to boot) you may as well just set up a replica.  As I've shown,
above, it's just not that hard.
Imapsync is still useful for migrating individual users from one imap
server to another.  In my case, I'm migrating from a cyrus 2.3.x server
using Berkeley db metadata files to a cyrus 2.4.x server which will be
entirely skiplist based.  Understood that you can convert db files to
skiplists, but I feel most comfortable using imapsync for this.  In this
use case there are only a handful of users, but they all have extremely
complex and massive mail folders.

My current plan is to use imapsync for the migration and then
replication to another dummy server for backup, assuming I can figure
out how to set up replication.

I strongly recommend against this course of action. If you're migrating between two boxes, which it sounds like you are, then you're much better off rsyncing the spool data between them (once you've halted cyrus) and then allowing cyrus to perform the necessary DB updates.

Check the Install-Upgrades page for anything else which changes between your versions of cyrus. Since you didn't specify which 2.3.x or 2.4.x you're using, I can't tell you what you'll need, but you'll find that info in doc/install-upgrade.html of your version. If you're installing from packages this may not be included, so do yourself a favor and download a copy for reference.

As the upgrade guide states (emphasis added):

   The default type for all databases is now skiplist which is very
   reliable now, all the bugs are ironed out! *Because ctl_cyrusdb -r
   automatically converts databases between known types, you shouldn't
   need to do anything*, but if you want to keep the old defaults,
   you'll need to make them explicit in your imapd.conf as follows:

   duplicate_db: berkeley-nosync
   ptscache_db: berkeley
   statuscache_db: berkeley-nosync
   tlscache_db: berkeley-nosync

You have said you want skiplist, so you needn't add those settings; just make sure you remove any that exist if you copy your old imapd.conf file over.

If you prefer to manually convert the DB files, you can do this with the supplied cvt_cyrusdb tool:

   $ /usr/lib/cyrus/bin/cvt_cyrusdb /tmp/annotations.db berkeley \
       /var/lib/imap/annotations.db skiplist

   or for Ubuntu
   $ cyrus cvt_cyrusdb /tmp/annotations.db berkeley \
       /var/lib/imap/annotations.db skiplist

Note that in this case, you should NOT rsync the DB files into the new server's /var/lib/imap (or whatever your config directory is) but rather into a holding area, like /tmp, from which you can read them for the DB conversion.

Also, make sure you do all of this as the cyrus user, or you'll end up with permissions problems.
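
Putting the holding area and the cyrus-user points together, a manual conversion could look something like this. It is a sketch, assuming sudo is available, the account is named cyrus, and the old Berkeley files were rsynced into /tmp; only annotations.db is listed because that is the one shown above, so extend DBS to match whatever your old server actually kept in Berkeley format. By default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: convert Berkeley DB files from a holding area into skiplist files
# in the config directory, running the converter as the cyrus user.
set -eu

DRY_RUN="${DRY_RUN:-1}"                       # 1 = only print commands, 0 = execute them
HOLD="${HOLD:-/tmp}"                          # holding area with the old DB files
CONFDIR="${CONFDIR:-/var/lib/imap}"           # new server's config directory
CVT="${CVT:-/usr/lib/cyrus/bin/cvt_cyrusdb}"  # path varies by distro
CYRUS_USER="${CYRUS_USER:-cyrus}"             # assumed account name
DBS="${DBS:-annotations.db}"                  # space-separated list; extend as needed

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# cvt_cyrusdb wants full paths: <old db> <old backend> <new db> <new backend>
convert_db() {
    run sudo -u "$CYRUS_USER" "$CVT" "$HOLD/$1" berkeley "$CONFDIR/$1" skiplist
}

for db in $DBS; do
    convert_db "$db"
done
```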

Good luck!
    -nic


Thanks again for your helpful comments!

----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus

--
Nic Bernstein                             n...@onlight.com
Onlight, Inc.                             www.onlight.com
219 N. Milwaukee St., Suite 2a            v. 414.272.4477
Milwaukee, Wisconsin  53202

