Hi --

On 04.03.2012 11:44, Timo Sirainen wrote:

In dovecot-2.1 hg you can now test dsync-based replication.
Everything isn't finished yet, but it appears to work and I've enabled
it for my @dovecot.fi mails.

I started trying it a few days ago and can confirm that dsync replication is usable,
but there are some issues; see below.


Let me start with replicator's configuration ...

Below is a configuration for virtual user setup.
[...]
service doveadm {
  # if you're using a single virtual user, set this to
  # start ssh as vmail (not root)
  user = vmail
}

... that led to the following complaints at start-up:

| dovecot: master: Dovecot v2.1.1 (d66568d34e40) starting up
| dovecot: doveadm: Error: Error reading configuration: net_connect_unix(/var/run/dovecot/config) failed: Permission denied
| [...]
| (repeatedly, presumably for the number of users in userdb?)

Therefore, I modified dsync_remote_cmd ...

dsync_remote_cmd = ssh -p 1234 -l vmail %{host} doveadm dsync-server -u%u -l%{lock_timeout} -n%{namespace}

... and used an empty 'service doveadm { }' instead. That worked, but for security reasons I would still prefer to run doveadm as the vmail user. How can I do that without
running into the error messages above?



Now some observations regarding replicator:

1) I see a lot of error messages like the following whenever replicator is in action
   (although everything is being synced correctly):

| <mail.err> mail dovecot: dsync-local(test): Error: remote: dsync-remote(test): Info: save: box=INBOX, uid=27, msgid=<3v2jfh5kv4z...@example.tld>, size=547, from=t...@example.tld (admin), flags=()

| <mail.info> mail dovecot: dsync-local(test): Error: remote: dsync-remote(test): Info: flag_change: box=TEST, uid=27568, msgid=<20120307144810.6360a74f...@example.tld>, size=435, from=t...@example.tld, flags=(\Seen)

   JFTR: I do have mail_log plugin activated.
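
   To keep such lines from drowning out real errors in the meantime, a small log filter could flag them. The following is only a minimal sketch: the pattern is an assumption derived from the two sample lines above (a local "Error:" wrapper around a remote line whose own level is "Info:"), and the function name is made up for illustration.

```python
import re

# Assumed pattern, based only on the two sample log lines above:
# dsync-local reports "Error: remote:" although the relayed remote
# line itself is tagged "Info:".
RELAYED_INFO = re.compile(
    r"dsync-local\([^)]*\): Error: remote: dsync-remote\([^)]*\): Info: "
)


def is_relayed_info(line: str) -> bool:
    """Return True if the line is a remote Info message logged as a local Error."""
    return RELAYED_INFO.search(line) is not None
```

   Lines for which is_relayed_info() returns True are mail_log output from the remote side; anything else logged at error level deserves a closer look.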


Some testing results:

1) I ran a test by sending locally produced mails every other minute on both servers simultaneously. That test ran for ~5 hours. All mails were synced correctly and
   none were lost, but there were some duplicates.

2) I sent 100 small test mails from a distant server to my mailservers (mx1 and mx2):

   a) replicator and dsync deactivated: received 100 distinct mails (57 at mx1, 43 at mx2).
   b) replicator active: 172 mails (100 distinct, plus a lot of duplicates: up to 8
      incarnations of the very same mail).

Ok, 2b) is a rather 'mailbomb-like' scenario, but it worries me a bit: One of my users is receiving mails from a mailing list that sends individual mails batch-wise ...

3) replicator active: 1000 mails sent ended up as 4523 mails on each server. Well, that was
   a mailbomb :-)

4) replicator active: 100 (and even 1000) locally produced mails at one server only: all 100 (and 1000) mails were synced perfectly well, without duplicates.

5) replicator active: 100 locally produced mails at both servers simultaneously: 341 mails,
   thus a lot of multiple incarnations.
(This test differed from 1) because all mails were sent in one batch.)

Final note on these tests: it made no difference whether sieve with redirecting, sieve with
redirecting and copying, or no sieve at all was involved.
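
Figures such as "172 mails, 100 distinct, up to 8 incarnations" can be reproduced by grouping the stored mails by Message-ID. The following is a minimal sketch, not the method used for the numbers above: it assumes the mailbox is stored in Maildir format and that every test mail carries a unique Message-ID header; the function names are made up for illustration.

```python
import mailbox
from collections import Counter


def maildir_message_ids(path):
    """Yield the Message-ID header of every mail in a Maildir (assumed format)."""
    box = mailbox.Maildir(path, create=False)
    for msg in box:
        yield msg.get("Message-ID")


def summarize(message_ids):
    """Return (total mails, distinct mails, max copies of a single mail)."""
    # Mails without a Message-ID cannot be grouped and are skipped.
    counts = Counter(mid.strip() for mid in message_ids if mid and mid.strip())
    total = sum(counts.values())
    distinct = len(counts)
    worst = max(counts.values(), default=0)
    return total, distinct, worst
```

Calling summarize(maildir_message_ids(path)) on the test user's Maildir on each server gives the three numbers in one go; the path is whatever mail_location points to for that user.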

It seems to me that whenever a larger number of mails arrives on both servers simultaneously, the replicator gets into trouble [1]. I am unsure, though, whether one can expect a replicator to
deal with such stress. Or can one?


Résumé: From my point of view, the overall performance of replicator is very good for my conditions (a handful of users, average workload of roughly 1000 mails a day).


Thank you for replicator and regards,
Michael

[1] JFTR: I did similar tests in the past with dsync running from cron every other minute
    with similar results.
