On 2018-06-07 07:34, Remko Lodder wrote:
On 7 Jun 2018, at 07:21, Reuben Farrelly <reuben-dove...@reub.net>
wrote:
Still not quite right for me.
Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: dsync(lightning.reub.net): I/O has stalled, no activity for 600 seconds (last sent=mail, last recv=mail (EOL))
Jun 7 15:11:33 thunderstorm.reub.net dovecot: doveadm: Error: Timeout during state=sync_mails (send=mails recv=recv_last_common)
I'm not sure if there is an underlying replication error or if the
message is just cosmetic, though.
Admittedly, I have had a few occurrences of this behaviour as well last
night. It happens more sporadically now and seems to be a conflict with my
user settings. (My users get added twice to the system, user-domain.tld and
u...@domain.tld; both are being replicated, and the noreplicate flag is not
yet honored in the version I am using, so I cannot bypass that yet.)
I do see messages that arrived on the other machine showing up on the
machine that I am using to read these emails, so replication seems to work
in that regard (which it obviously did not do that well before).
First of all: the patch applied to 2.3.1 is a major improvement; there
are no more hanging processes.
But I do find quite a number of error messages like:
Jun 7 06:34:20 mail dovecot: doveadm: Error: Failed to lock mailbox NAME for dsyncing: \
file_create_locked(/.../USER/mailboxes/NAME/dbox-Mails/.dovecot-box-sync.lock) \
failed: fcntl(/.../USER/mailboxes/NAME/dbox-Mails/.dovecot-box-sync.lock, write-lock, F_SETLKW) \
locking failed: Timed out after 30 seconds (WRITE lock held by pid 79452)
These messages are only found on the server that normally receives
synced messages (because almost all mail is received via the other
master due to MX priorities).
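
For anyone who wants to see how often these lock timeouts occur and which pid holds the lock, here is a minimal sketch of a log summarizer. It is my own illustration, not part of Dovecot: the function name summarize_lock_timeouts is hypothetical, and it assumes each error is written to the log as a single line (without the backslash wrapping used in this mail).

```python
import re
from collections import Counter

# Matches the lock-timeout error text quoted in this mail.
# Assumption: the whole error sits on one log line.
LOCK_TIMEOUT = re.compile(
    r"Failed to lock mailbox (?P<mailbox>.+?) for dsyncing: .*"
    r"Timed out after (?P<secs>\d+) seconds "
    r"\((?P<mode>\w+) lock held by pid (?P<pid>\d+)\)"
)

def summarize_lock_timeouts(lines):
    """Count lock-timeout errors per (mailbox, pid holding the lock)."""
    counts = Counter()
    for line in lines:
        match = LOCK_TIMEOUT.search(line)
        if match:
            key = (match.group("mailbox"), int(match.group("pid")))
            counts[key] += 1
    return counts
```

Feeding it the lines of the mail log (for example via open(path) on whatever file your syslog writes to) gives a per-mailbox, per-pid tally, which may help tell whether one long-running dsync process is responsible for most of the timeouts.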
Conclusion: After 12 hours of running a patched FreeBSD port I do get those
error messages, but replication seems to work now. Still, I have the
feeling that something else might be going wrong.
@Timo: Wouldn't it be worth adding replicator/aggregator error messages
to head, like what Aki sent to Remko? That might shed some light on
replication issues today and in the future.
Regards,
Michael