>
>A few days back, I sent an overview of this problem but received no
responses. Since then, I have run dozens of traces to isolate the problem,
which was difficult because timing issues are involved. I have finally nailed it
down. If this is not the proper place to report such bugs, or if someone knows
that this bug has been fixed, please let me know. As I noted in my earlier post,
we have been running Dovecot 2.2.10 on a pair of CentOS 7 boxes with
replication for the past year. We have been quite happy with the performance and
reliability.
>
>Recently we received a report that emails could reappear in the INBOX after being
deleted. After running a pile of traces, I determined that the problem was, strangely,
related to replication. For the purposes of this discussion, I will refer to the two
symmetric replicating servers as A and B. Further, let us assume that during
"normal" operation, all the emails are delivered to A via SMTP and are replicated
to B. Under those assumptions, if the IMAP user connects to A (where the messages were
originally delivered), there is no problem, at least no problem I was able to find. The
problem I am describing only arises if the IMAP user connects to B. Connecting to B has
never presented any other problems that I am aware of.
>
>The test for which I have provided the trace starts with a test mailbox
containing only 3 unread messages in the INBOX. Moving 1 of the unread messages
to Trash is all that is needed to reproduce the problem. Remember this is ONLY a
problem if the IMAP sessions do not connect to the server to which the messages
were originally delivered. Also, I found that there is a timing window. The
critical IMAP commands are:
>
> UID STORE xxx +FLAGS.SILENT (\Seen)
> UID MOVE xxx Trash
>
>If you introduce a large enough delay (I arbitrarily chose 5 seconds) between
those two commands, there is no problem. Presumably this has to do with the two
boxes syncing up some critical data structure.
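As a rough sketch of the sequence described above, the following shell function prints the IMAP commands for one session against server B, with an adjustable pause between the two critical commands. The function name, command tags, and arguments are hypothetical and not taken from the original trace. The sketch also assumes the server tolerates commands being sent without waiting for each response.

```shell
#!/bin/sh
# Hypothetical helper: print the IMAP commands for one test session.
# Arguments: user, password, message UID, delay in seconds (default 0)
# between UID STORE and UID MOVE.
imap_move_session() {
    user=$1
    pass=$2
    msg_uid=$3
    delay=${4:-0}
    printf 'a1 LOGIN %s %s\r\n' "$user" "$pass"
    printf 'a2 SELECT INBOX\r\n'
    printf 'a3 UID STORE %s +FLAGS.SILENT (\\Seen)\r\n' "$msg_uid"
    # A delay of 0 should fall inside the timing window; the report above
    # saw no problem with a delay of 5 seconds.
    sleep "$delay"
    printf 'a4 UID MOVE %s Trash\r\n' "$msg_uid"
    printf 'a5 LOGOUT\r\n'
}

# Example use (host name is a placeholder for server B):
#   imap_move_session testuser secret 3 0 | openssl s_client -quiet -connect b.example.com:993
```

Running it once with a delay of 0 and once with a delay of 5 would exercise both sides of the timing window.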
What mailbox format do you use? Are you able to reproduce this by running
doveadm sync commands manually instead of letting replication do it? For
example:
- doveadm sync -s "" -d -u user@domain > state
- Run the UID STORE & UID MOVE
- doveadm sync -s "`cat state`" -d -u user@domain
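The steps above could be wrapped in a small script along these lines. This is only a sketch: the function names and the DOVEADM override are hypothetical conveniences, and the doveadm flags are copied unchanged from the commands listed above.

```shell
#!/bin/sh
# DOVEADM may be overridden (for example with "echo") for a dry run.
DOVEADM=${DOVEADM:-doveadm}

# Step 1: sync with an empty state and save the returned state string.
# Arguments: user, state file.
save_sync_state() {
    "$DOVEADM" sync -s "" -d -u "$1" > "$2"
}

# Step 3: sync again, starting from the state saved in step 1.
# Arguments: user, state file.
resume_sync() {
    "$DOVEADM" sync -s "$(cat "$2")" -d -u "$1"
}

# Example use:
#   save_sync_state user@domain state
#   (run the UID STORE and UID MOVE over IMAP here)
#   resume_sync user@domain state
```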
There have been some fixes, especially recently.
https://github.com/dovecot/core/commit/950a6e61d6c2dac961ce031bdd8b2895bc32b827
sounds a bit similar, although I don't really see how it would apply here.
It would be a good idea to try anyway with v2.2.22.rc1 (which seems to be stable
enough that I'll make the v2.2.22 release soon).
Anyway, I attempted a few times to reproduce it with your test but wasn't able
to.
I was out when you were kind enough to reply. To answer your question,
we are using Maildir format. The trace I provided was based upon IMAP
interactions with Roundcube (though the problem was reproducible with
several mail clients). I left in a few more steps to make the trace
look less contrived. However, I have since reduced it further to just a
couple of connection sessions. What that exercise showed, and what was
not apparent to me in my prior posting, is that the "STATUS INBOX" command
that ultimately reveals the problem (it shows the message reappearing)
only becomes "wrong" when it is issued in a subsequent session. That is,
even if I inject an artificially large delay after the "UID STORE" /
"UID MOVE" commands before the "STATUS INBOX" command in the same
session, that result is never "wrong". But, as soon as I open a
subsequent IMAP session, the "STATUS INBOX" command then shows the
problematic results. I have never dug into the Dovecot code base, but I
assume this relates to how the session data is cached and how
replication updates it. None of this is relevant if the problem has
already been fixed, so I will endeavor to set up a couple of test boxes
with the current version to verify. The link you provided does look
quite hopeful. Thanks so much.
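Since the wrong result only shows up in a subsequent session, the check could be sketched as a separate connection that does nothing but query the INBOX counters. The function name and tags below are hypothetical. In the test described earlier (3 unread messages, 1 moved to Trash), a correct reply would presumably report 2 messages.

```shell
#!/bin/sh
# Hypothetical helper: print the IMAP commands for a fresh session that
# only asks for the INBOX counters. Arguments: user, password.
imap_status_session() {
    printf 'b1 LOGIN %s %s\r\n' "$1" "$2"
    printf 'b2 STATUS INBOX (MESSAGES UNSEEN)\r\n'
    printf 'b3 LOGOUT\r\n'
}

# Example use (host name is a placeholder for server B):
#   imap_status_session testuser secret | openssl s_client -quiet -connect b.example.com:993
```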