Hi! This is probably going to get fixed in 2.3.19; it looks like an issue we are already fixing.

Aki

> On 26/04/2022 16:38 Paul Kudla (SCOM.CA Internet Services Inc.) <p...@scom.ca> wrote:
>
> Agreed, there seems to be no way of posting these kinds of issues to
> see if they are even being addressed, or even known about, moving
> forward in new updates.
>
> I read somewhere there is a new branch coming out, but nothing as of
> yet?
>
> 2.4 maybe ....
> 5.0 ........
>
> My previous replication issues (back in February) went unanswered.
>
> Not faulting anyone, but the developers do seem to be disconnected
> from issues as of late? Or concentrating on other issues.
>
> I have no problem with support contracts for day-to-day maintenance;
> however, as a programmer myself, they usually don't work, as the other
> end relies on the latest source code anyway and thus cannot help.
>
> I am trying to take apart the replicator C code based on 2.3.18, as
> most of it does work to some extent.
>
> tcps just does not work (i.e. the 600-second default in the C code).
>
> My thought is that tcp works OK but fails when the replicator, through
> dsync-client.c, is asked to return the folder list?
>
> replicator-brain.c seems to control the overall process and timing.
>
> replicator-queue.c seems to handle the queue file, which does seem to
> carry accurate info.
>
> Things in the source code are documented well enough to figure this
> out, but I am still going through all the related .h files, whose
> documentation is all over the place.
>
> There is no clear documentation on the .h lib files, so I have to walk
> through the tree one file at a time finding the relevant code.
>
> Since the dsync from doveadm does seem to work OK, I have to assume
> the dsync-client used to compile the replicator is at fault somehow,
> or a call from it upstream?
>
> Thanks for your input on the other issues noted below; I will keep
> that in mind when disassembling the source code.
>
> No sense in fixing one thing and leaving something else behind - it is
> probably all related anyway.
>
> I have two test servers available, so I can play with all this offline
> to reproduce the issues.
>
> Unfortunately I have to make a living first; this will be addressed
> when possible, as I don't like live systems running this way and
> currently only have 5 accounts with this issue (mine included).
>
> Happy Tuesday !!!
> Thanks - paul
>
> Paul Kudla
>
> Scom.ca Internet Services <http://www.scom.ca>
> 004-1009 Byron Street South
> Whitby, Ontario - Canada
> L1N 4S3
>
> Toronto 416.642.7266
> Main 1.866.411.7266
> Fax 1.888.892.7266
>
> On 4/26/2022 9:03 AM, Reuben Farrelly wrote:
> >
> > I ran into this back in February and documented a reproducible test
> > case (and sent it to this list). In short - I was able to reproduce
> > this by having a valid and consistent mailbox on the source/local,
> > creating a very standard empty Maildir/(new|cur|tmp) folder on the
> > remote replica, and then initiating the replication from the source.
> > This consistently caused dsync to fail replication with the error
> > "dovecot.index reset, view is now inconsistent" and abort the sync,
> > leaving the replica mailbox in a screwed-up, inconsistent state.
> > Client connections on the source replica were also dropped when this
> > error occurred. You can see the error by enabling debug-level logging
> > if you initiate dsync manually on a test mailbox.
> >
> > The only workaround I found was to remove the remote Maildir and let
> > Dovecot create the whole thing from scratch.
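> >
> > (For reference, a manual run along these lines should surface the
> > error in the debug output - the username is only an example here;
> > -d makes dsync use the configured mail_replica as the destination,
> > and -D together with mail_debug=yes gives the debug logging that
> > contains the "dovecot.index reset" line:)
> >
> >     doveadm -D -o mail_debug=yes sync -u testuser@example.com -d -N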
> > Dovecot did not like any existing folders on the destination
> > replica, even if they had the same names as on the source and were
> > completely empty. I was able to reproduce this with the bare minimum
> > of folders - just an INBOX!
> >
> > I have no idea if any of the developers saw my post or if the bug
> > has been fixed for the next release. But it seems to be quite a
> > common problem over time (I saw a few posts from people going back a
> > long way with the same problem) and it is seriously disruptive to
> > clients. The error message is not helpful in tracking down the
> > problem either.
> >
> > Secondly, I have also had an ongoing and longstanding problem using
> > tcps: for replication. For some reason using tcps: (with no other
> > changes at all to the config) results in a lot of timeout messages:
> > "Error: dsync I/O has stalled, no activity for 600 seconds". This
> > goes away if I revert back to tcp: instead of tcps: - with tcp: I
> > very rarely get timeouts. No idea why; I guess this is a bug of some
> > sort also.
> >
> > It's disappointing that there appears to be no way to have these
> > sorts of problems addressed like there once was. I am not using
> > Dovecot for commercial purposes, so paying a fortune for a support
> > contract for a high-end installation just isn't going to happen, and
> > this list seems to be quite ordinary for getting support and
> > reporting bugs nowadays....
> >
> > Reuben
> >
> > On 26/04/2022 7:21 pm, Paul Kudla (SCOM.CA Internet Services Inc.) wrote:
> >>
> >> Side issue:
> >>
> >> If you are getting inconsistent dsyncs there is no real way to fix
> >> this in the long run.
> >>
> >> I know it's a pain (I already had to do it myself).
> >>
> >> I needed to do a full sync: take one server offline, delete the
> >> user dir (with Dovecot offline) and then rsync (or somehow
> >> duplicate the main server's user data) over to the remote again.
> >>
> >> Then bring the remote back up, and it kind of worked.
> >>
> >> Best suggestion is to bring the main server down at night so the
> >> copy is clean?
> >>
> >> If using Postfix you can enable the soft_bounce option and the mail
> >> will spool up until everything comes back online.
> >>
> >> (It needs to be enabled on both servers.)
> >>
> >> Replication was still an issue on accounts with 300+ folders in
> >> them; still working on a fix for that.
> >>
> >> Happy Tuesday !!!
> >> Thanks - paul
> >>
> >> Paul Kudla
> >>
> >> Scom.ca Internet Services <http://www.scom.ca>
> >> 004-1009 Byron Street South
> >> Whitby, Ontario - Canada
> >> L1N 4S3
> >>
> >> Toronto 416.642.7266
> >> Main 1.866.411.7266
> >> Fax 1.888.892.7266
> >>
> >> On 4/25/2022 10:01 AM, Arnaud Abélard wrote:
> >>> Ah, I'm now getting errors in the logs; that would explain the
> >>> increasing number of failed sync requests:
> >>>
> >>> dovecot: imap(xxxxx)<2961235><Bs6w43rdQPAqAcsFiXEmAInUhhA3Rfqh>:
> >>> Error: Mailbox INBOX: /vmail/l/i/xxxxx/dovecot.index reset, view is
> >>> now inconsistent
> >>>
> >>> And sure enough:
> >>>
> >>> # doveadm replicator status xxxxx
> >>>
> >>> xxxxx none 00:02:54 07:11:28 - y
> >>>
> >>> What could explain that error?
> >>>
> >>> Arnaud
> >>>
> >>> On 25/04/2022 15:13, Arnaud Abélard wrote:
> >>>> Hello,
> >>>>
> >>>> On my side we are running Linux (Debian Buster).
> >>>>
> >>>> I'm not sure my problem is actually the same as Paul's or yours,
> >>>> Sebastian, since I have a lot of boxes but they are actually
> >>>> small (quota of 110MB), so I doubt any of them have more than a
> >>>> dozen IMAP folders.
> >>>>
> >>>> The main symptom is that I have tons of full sync requests
> >>>> awaiting, but even though no other sync is pending, the
> >>>> replicator just waits for something to trigger those syncs.
> >>>>
> >>>> Today, with users back, I can see that normal and incremental
> >>>> syncs are being done on the 15 connections, with an occasional
> >>>> full sync here or there and lots of "Waiting 'failed' requests":
> >>>>
> >>>> Queued 'sync' requests          0
> >>>> Queued 'high' requests          0
> >>>> Queued 'low' requests           0
> >>>> Queued 'failed' requests        122
> >>>> Queued 'full resync' requests   28785
> >>>> Waiting 'failed' requests       4294
> >>>> Total number of known users     42512
> >>>>
> >>>> So, why didn't the replicator take advantage of the weekend to
> >>>> replicate the mailboxes while no users were using them?
> >>>>
> >>>> Arnaud
> >>>>
> >>>> On 25/04/2022 13:54, Sebastian Marske wrote:
> >>>>> Hi there,
> >>>>>
> >>>>> thanks for your insights and for diving deeper into this, Paul!
> >>>>>
> >>>>> For me, the users ending up in 'Waiting for dsync to finish' all
> >>>>> have more than 256 IMAP folders as well (ranging from 288 up to
> >>>>> >5500, as per 'doveadm mailbox list -u <username> | wc -l'). For
> >>>>> more details on my setup please see my post from February [1].
> >>>>>
> >>>>> @Arnaud: What OS are you running on?
> >>>>>
> >>>>> Best
> >>>>> Sebastian
> >>>>>
> >>>>> [1] https://dovecot.org/pipermail/dovecot/2022-February/124168.html
> >>>>>
> >>>>> On 4/24/22 19:36, Paul Kudla (SCOM.CA Internet Services Inc.) wrote:
> >>>>>>
> >>>>>> Question: I'm having similar replication issues.
> >>>>>>
> >>>>>> Please read everything below and advise the folder counts on
> >>>>>> the non-replicated users?
> >>>>>>
> >>>>>> I find the total number of folders per account seems to be a
> >>>>>> factor, and NOT the size of the mailbox.
> >>>>>>
> >>>>>> i.e. I have customers with 40G of email spread over 40 or so
> >>>>>> folders with no problem, and it works OK.
> >>>>>>
> >>>>>> 300+ folders seems to be the issue.
> >>>>>>
> >>>>>> I have been going through the replication code.
> >>>>>>
> >>>>>> No errors are being logged.
> >>>>>>
> >>>>>> I am assuming that the replicator --> dsync-client --> other
> >>>>>> server path is timing out or not reading the folder lists
> >>>>>> correctly (i.e. dies after X folders read).
> >>>>>>
> >>>>>> Thus I am going through the code, patching in log entries etc.
> >>>>>> to find the issues.
> >>>>>>
> >>>>>> See:
> >>>>>>
> >>>>>> [13:33:57] mail18.scom.ca [root:0] /usr/local/var/lib/dovecot
> >>>>>> # ll
> >>>>>> total 86
> >>>>>> drwxr-xr-x  2 root  wheel  uarch    4B Apr 24 11:11 .
> >>>>>> drwxr-xr-x  4 root  wheel  uarch    4B Mar  8  2021 ..
> >>>>>> -rw-r--r--  1 root  wheel  uarch   73B Apr 24 11:11 instances
> >>>>>> -rw-r--r--  1 root  wheel  uarch  160K Apr 24 13:33 replicator.db
> >>>>>>
> >>>>>> [13:33:58] mail18.scom.ca [root:0] /usr/local/var/lib/dovecot
> >>>>>> #
> >>>>>>
> >>>>>> replicator.db seems to get updated OK but never processed
> >>>>>> properly.
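> >>>>>>
> >>>>>> (Side note: to check the folder-count theory on other installs,
> >>>>>> something roughly like this should print a per-user folder
> >>>>>> count - assuming 'doveadm user' can iterate your userdb;
> >>>>>> untested as written:)
> >>>>>>
> >>>>>> for u in $(doveadm user '*'); do
> >>>>>>     printf '%s %s\n' "$u" "$(doveadm mailbox list -u "$u" | wc -l)"
> >>>>>> done | sort -k2 -n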
> >>>>>>
> >>>>>> # sync.users
> >>>>>> n...@elirpa.com        high  00:09:41  463:47:01  -          y
> >>>>>> ke...@elirpa.com       high  00:09:23  463:45:43  -          y
> >>>>>> p...@scom.ca           high  00:09:41  463:46:51  -          y
> >>>>>> e...@scom.ca           high  00:09:43  463:47:01  -          y
> >>>>>> ed.ha...@dssmgmt.com   high  00:09:42  463:46:58  -          y
> >>>>>> p...@paulkudla.net     high  00:09:44  463:47:03  580:35:07  y
> >>>>>>
> >>>>>> So ....
> >>>>>>
> >>>>>> Two things:
> >>>>>>
> >>>>>> First, to get the production stuff to work I had to write a
> >>>>>> script that finds the bad syncs and then forces a dsync between
> >>>>>> the servers.
> >>>>>>
> >>>>>> I run it on each server from crontab:
> >>>>>>
> >>>>>> */10 * * * * root /usr/bin/nohup /programs/common/sync.recover > /dev/null
> >>>>>>
> >>>>>> Python script to sort things out:
> >>>>>>
> >>>>>> # cat /programs/common/sync.recover
> >>>>>> #!/usr/local/bin/python3
> >>>>>>
> >>>>>> # Force sync between servers that are reporting bad?
> >>>>>>
> >>>>>> import os, sys, django, socket
> >>>>>> import subprocess
> >>>>>> from optparse import OptionParser
> >>>>>>
> >>>>>> from lib import *
> >>>>>>
> >>>>>> # Sample re-index of a mailbox:
> >>>>>> # doveadm -D force-resync -u p...@scom.ca -f INBOX*
> >>>>>>
> >>>>>> USAGE_TEXT = '''\
> >>>>>> usage: %%prog %s[options]
> >>>>>> '''
> >>>>>>
> >>>>>> parser = OptionParser(usage=USAGE_TEXT % '', version='0.4')
> >>>>>>
> >>>>>> parser.add_option("-m", "--send_to", dest="send_to", help="Send Email To")
> >>>>>> parser.add_option("-e", "--email", dest="email_box", help="Box to Index")
> >>>>>> parser.add_option("-d", "--detail", action='store_true',
> >>>>>>                   dest="detail", default=False, help="Detailed report")
> >>>>>> parser.add_option("-i", "--index", action='store_true',
> >>>>>>                   dest="index", default=False, help="Index")
> >>>>>>
> >>>>>> options, args = parser.parse_args()
> >>>>>>
> >>>>>> print(options.email_box)
> >>>>>> print(options.send_to)
> >>>>>> print(options.detail)
> >>>>>>
> >>>>>> #sys.exit()
> >>>>>>
> >>>>>> print('Getting Current User Sync Status')
> >>>>>> command = commands("/usr/local/bin/doveadm replicator status '*'")
> >>>>>>
> >>>>>> #print(command)
> >>>>>>
> >>>>>> sync_user_status = command.output.split('\n')
> >>>>>>
> >>>>>> #print(sync_user_status)
> >>>>>>
> >>>>>> synced = []
> >>>>>>
> >>>>>> for n in range(1, len(sync_user_status)):
> >>>>>>     user = sync_user_status[n]
> >>>>>>     print('Processing User : %s' % user.split(' ')[0])
> >>>>>>     if user.split(' ')[0] != options.email_box:
> >>>>>>         if options.email_box != None:
> >>>>>>             continue
> >>>>>>
> >>>>>>     if options.index == True:
> >>>>>>         command = '/usr/local/bin/doveadm -D force-resync -u %s -f INBOX*' % user.split(' ')[0]
> >>>>>>         command = commands(command)
> >>>>>>         command = command.output
> >>>>>>
> >>>>>>     #print(user)
> >>>>>>     for nn in range(len(user) - 1, 0, -1):
> >>>>>>         #print(nn)
> >>>>>>         #print(user[nn])
> >>>>>>
> >>>>>>         if user[nn] == '-':
> >>>>>>             #print('skipping ... %s' % user.split(' ')[0])
> >>>>>>             break
> >>>>>>
> >>>>>>         if user[nn] == 'y':  # Found a bad mailbox
> >>>>>>             print('syncing ... %s' % user.split(' ')[0])
> >>>>>>
> >>>>>>             if options.detail == True:
> >>>>>>                 command = '/usr/local/bin/doveadm -D sync -u %s -d -N -l 30 -U' % user.split(' ')[0]
> >>>>>>                 print(command)
> >>>>>>                 command = commands(command)
> >>>>>>                 command = command.output.split('\n')
> >>>>>>                 print(command)
> >>>>>>                 print('Processed Mailbox for ... %s' % user.split(' ')[0])
> >>>>>>                 synced.append('Processed Mailbox for ... %s' % user.split(' ')[0])
> >>>>>>                 for nnn in range(len(command)):
> >>>>>>                     synced.append(command[nnn] + '\n')
> >>>>>>                 break
> >>>>>>
> >>>>>>             if options.detail == False:
> >>>>>>                 #command = '/usr/local/bin/doveadm -D sync -u %s -d -N -l 30 -U' % user.split(' ')[0]
> >>>>>>                 #print(command)
> >>>>>>                 #command = os.system(command)
> >>>>>>                 command = subprocess.Popen(
> >>>>>>                     ["/usr/local/bin/doveadm sync -u %s -d -N -l 30 -U" % user.split(' ')[0]],
> >>>>>>                     shell=True, stdin=None, stdout=None,
> >>>>>>                     stderr=None, close_fds=True)
> >>>>>>
> >>>>>>                 print('Processed Mailbox for ... %s' % user.split(' ')[0])
> >>>>>>                 synced.append('Processed Mailbox for ... %s' % user.split(' ')[0])
> >>>>>>                 #sys.exit()
> >>>>>>                 break
> >>>>>>
> >>>>>> if len(synced) != 0:
> >>>>>>     # Send an email showing the bad synced boxes?
> >>>>>>
> >>>>>>     if options.send_to != None:
> >>>>>>         send_from = 'moni...@scom.ca'
> >>>>>>         send_to = ['%s' % options.send_to]
> >>>>>>         send_subject = 'Dovecot Bad Sync Report for : %s' % (socket.gethostname())
> >>>>>>         send_text = '\n\n'
> >>>>>>         for n in range(len(synced)):
> >>>>>>             send_text = send_text + synced[n] + '\n'
> >>>>>>
> >>>>>>         send_files = []
> >>>>>>         sendmail(send_from, send_to, send_subject, send_text, send_files)
> >>>>>>
> >>>>>> sys.exit()
> >>>>>>
> >>>>>> Second:
> >>>>>>
> >>>>>> I posted this a month ago - no response.
> >>>>>>
> >>>>>> Please appreciate that I am trying to help ....
> >>>>>>
> >>>>>> After much testing I can now reproduce the replication issues
> >>>>>> at hand.
> >>>>>>
> >>>>>> I am running on FreeBSD 12 & 13 stable (both test and
> >>>>>> production servers).
> >>>>>>
> >>>>>> SDRAM drives etc ...
> >>>>>>
> >>>>>> Basically replication works fine until reaching a folder
> >>>>>> quantity of ~256 or more.
> >>>>>>
> >>>>>> To reproduce using doveadm I created folders like
> >>>>>>
> >>>>>> INBOX/folder-0
> >>>>>> INBOX/folder-1
> >>>>>> INBOX/folder-2
> >>>>>> INBOX/folder-3
> >>>>>> and so forth ......
> >>>>>>
> >>>>>> I created 200 folders and they replicated OK on both servers.
> >>>>>>
> >>>>>> I created another 200 (400 total) and the replicator got stuck
> >>>>>> and would not update the mailbox on the alternate server any
> >>>>>> more, and it is still "updating" 4 days later?
> >>>>>>
> >>>>>> Basically the replicator goes so far and either hangs or, more
> >>>>>> likely, bails on an error that is not reported to the debug
> >>>>>> logging?
> >>>>>>
> >>>>>> However, dsync will sync the two servers, but only when run
> >>>>>> manually (i.e. all the folders will sync).
> >>>>>>
> >>>>>> I have two test servers available if you need any kind of
> >>>>>> access - again, here to help.
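> >>>>>>
> >>>>>> (If it helps, a quick way to see whether the replica actually
> >>>>>> received everything is to diff the folder lists from both
> >>>>>> sides - a rough sketch, where the second hostname is only a
> >>>>>> placeholder for the other replica:)
> >>>>>>
> >>>>>> doveadm mailbox list -u p...@paulkudla.net | sort > /tmp/local.folders
> >>>>>> ssh replica.example.com "doveadm mailbox list -u p...@paulkudla.net" | sort > /tmp/remote.folders
> >>>>>> diff /tmp/local.folders /tmp/remote.folders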
> >>>>>> > >>>>>> [07:28:42] mail18.scom.ca [root:0] ~ > >>>>>> # sync.status > >>>>>> Queued 'sync' requests 0 > >>>>>> Queued 'high' requests 6 > >>>>>> Queued 'low' requests 0 > >>>>>> Queued 'failed' requests 0 > >>>>>> Queued 'full resync' requests 0 > >>>>>> Waiting 'failed' requests 0 > >>>>>> Total number of known users 255 > >>>>>> > >>>>>> username type status > >>>>>> p...@scom.ca normal Waiting for dsync to > >>>>>> finish > >>>>>> ke...@elirpa.com incremental Waiting for dsync to > >>>>>> finish > >>>>>> ed.ha...@dssmgmt.com incremental Waiting for dsync to > >>>>>> finish > >>>>>> e...@scom.ca incremental Waiting for dsync to > >>>>>> finish > >>>>>> n...@elirpa.com incremental Waiting for dsync to > >>>>>> finish > >>>>>> p...@paulkudla.net incremental Waiting for dsync to > >>>>>> finish > >>>>>> > >>>>>> > >>>>>> i have been going through the c code and it seems the replication > >>>>>> gets > >>>>>> requested ok > >>>>>> > >>>>>> replicator.db does get updated ok with the replicated request for the > >>>>>> mbox in question. > >>>>>> > >>>>>> however i am still looking for the actual replicator function in the > >>>>>> lib's that do the actual replication requests > >>>>>> > >>>>>> the number of folders & subfolders is defanately the issue - not the > >>>>>> mbox pyhsical size as thought origionally. > >>>>>> > >>>>>> if someone can point me in the right direction, it seems either the > >>>>>> replicator is not picking up on the number of folders to replicat > >>>>>> properly or it has a hard set limit like 256 / 512 / 65535 etc and > >>>>>> stops > >>>>>> the replication request thereafter. > >>>>>> > >>>>>> I am mainly a machine code programmer from the 80's and have > >>>>>> concentrated on python as of late, 'c' i am starting to go through > >>>>>> just > >>>>>> to give you a background on my talents. > >>>>>> > >>>>>> It took 2 months to finger this out. > >>>>>> > >>>>>> this issue also seems to be indirectly causing the duplicate messages > >>>>>> supression not to work as well. > >>>>>> > >>>>>> python programming to reproduce issue (loops are for last run > >>>>>> started @ > >>>>>> 200 - fyi) : > >>>>>> > >>>>>> # cat mbox.gen > >>>>>> #!/usr/local/bin/python2 > >>>>>> > >>>>>> import os,sys > >>>>>> > >>>>>> from lib import * > >>>>>> > >>>>>> > >>>>>> user = 'p...@paulkudla.net' > >>>>>> > >>>>>> """ > >>>>>> for count in range (0,600) : > >>>>>> box = 'INBOX/folder-%s' %count > >>>>>> print count > >>>>>> command = '/usr/local/bin/doveadm mailbox create -s -u %s > >>>>>> %s' > >>>>>> %(user,box) > >>>>>> print command > >>>>>> a = commands.getoutput(command) > >>>>>> print a > >>>>>> """ > >>>>>> > >>>>>> for count in range (0,600) : > >>>>>> box = 'INBOX/folder-0/sub-%' %count > >>>>>> print count > >>>>>> command = '/usr/local/bin/doveadm mailbox create -s -u %s > >>>>>> %s' > >>>>>> %(user,box) > >>>>>> print command > >>>>>> a = commands.getoutput(command) > >>>>>> print a > >>>>>> > >>>>>> > >>>>>> > >>>>>> #sys.exit() > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> Happy Sunday !!! 
> >>>>>> Thanks - paul
> >>>>>>
> >>>>>> Paul Kudla
> >>>>>>
> >>>>>> Scom.ca Internet Services <http://www.scom.ca>
> >>>>>> 004-1009 Byron Street South
> >>>>>> Whitby, Ontario - Canada
> >>>>>> L1N 4S3
> >>>>>>
> >>>>>> Toronto 416.642.7266
> >>>>>> Main 1.866.411.7266
> >>>>>> Fax 1.888.892.7266
> >>>>>>
> >>>>>> On 4/24/2022 10:22 AM, Arnaud Abélard wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> I am working on replicating a server (and adding compression
> >>>>>>> on the other side), and since I had "Error: dsync I/O has
> >>>>>>> stalled, no activity for 600 seconds (version not received)"
> >>>>>>> errors I upgraded both the source and destination servers to
> >>>>>>> the latest 2.3 version (2.3.18). While before the upgrade all
> >>>>>>> 15 replication connections were busy, after upgrading,
> >>>>>>> "doveadm replicator dsync-status" shows that most of the time
> >>>>>>> nothing is being replicated at all. I can see some brief
> >>>>>>> replications here and there, but 99.9% of the time nothing is
> >>>>>>> happening at all.
> >>>>>>>
> >>>>>>> I have a replication_full_sync_interval of 12 hours, but I
> >>>>>>> have thousands of users whose last full sync was over 90 hours
> >>>>>>> ago.
> >>>>>>>
> >>>>>>> "doveadm replicator status" also shows that I have over 35,000
> >>>>>>> queued full resync requests, but no sync, high or low queued
> >>>>>>> requests, so why aren't the full requests occurring?
> >>>>>>>
> >>>>>>> There are no errors in the logs.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Arnaud
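> >>>>>>>
> >>>>>>> P.S. For reference, the relevant bits of my replication config
> >>>>>>> look roughly like this (the hostname is a placeholder and the
> >>>>>>> 12h / 15 just match the numbers above; treat it as a sketch,
> >>>>>>> not a verbatim copy of my dovecot.conf):
> >>>>>>>
> >>>>>>> replication_full_sync_interval = 12h
> >>>>>>> replication_max_conns = 15
> >>>>>>>
> >>>>>>> mail_plugins = $mail_plugins notify replication
> >>>>>>>
> >>>>>>> plugin {
> >>>>>>>   mail_replica = tcp:replica.example.com
> >>>>>>> }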