I wanted to follow up on my NFS lock issue with dovecot-uidlist.

After doing some research, the current FreeBSD NFS client (as of 6.2-STABLE at least) appears to have a long-standing bug with caching on files with high create/removal rates. Whether the NFS access cache is enabled or disabled, the NFS client still uses another cache for certain file attributes, and at least one second must pass before it will invalidate the entry for a deleted file. If the file attributes are accessed before the second is up, the timer is restarted.

Since the dotlocking code in Dovecot micro-sleeps for less than a second between each check for the .lock file, the entry is never removed from the kernel's cache, so the lstat() on the lock file always returns 0 (success). This prevents the lock file from being re-created until the stall timeout is reached. All Dovecot processes (IMAP, POP3, deliver) hang until the kernel invalidates the entry, causing the problem. Using a sleep() call of more than 1 second after removing the lock and before attempting to use it again helps, but is obviously not very performance-friendly for a high-volume mail server.
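To make the access pattern concrete, here is a minimal sketch of a sub-second polling loop of the kind described above. It is not Dovecot's actual code: the function name wait_for_lock_removal() and its parameters are made up for illustration. On a local filesystem it behaves as expected; on the affected NFS client, each lstat() inside the loop would restart the one-second timer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper (not from Dovecot): poll until a dotlock file
 * disappears. Returns 0 once lstat() reports ENOENT, 1 if the lock is
 * still visible after max_checks polls, -1 on any other lstat() error. */
int wait_for_lock_removal(const char *lock_path, int max_checks,
                          long sleep_usecs)
{
    struct stat st;
    struct timespec delay;
    int i;

    delay.tv_sec = sleep_usecs / 1000000L;
    delay.tv_nsec = (sleep_usecs % 1000000L) * 1000L;

    for (i = 0; i < max_checks; i++) {
        if (lstat(lock_path, &st) < 0)
            return errno == ENOENT ? 0 : -1;
        /* The lock still appears to exist: sleep briefly and retry.
         * With a sub-second delay, the next lstat() arrives before
         * the cached attribute entry has had a chance to expire. */
        nanosleep(&delay, NULL);
    }
    return 1;
}
```

Calling it with a delay of a few hundred milliseconds reproduces the "check, micro-sleep, check again" rhythm; the return value 1 corresponds to the case where the caller eventually falls back to the stall timeout.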

The other solution I've found that seems to work is updating the mtime on the .lock file if all other dotlocking checks fail in check_lock() in src/lib/file-dotlock.c (see attached patch). This invalidates the cached entry in the kernel and allows lstat() to return the correct response (-1), as the .lock file no longer exists. I don't check whether the utime() call fails, since a failure just means the kernel invalidated the entry when it should have and can be ignored.

I have performed some high-volume delivery (deliver) and pickup testing (imap and pop3) using the workaround, and so far everything has worked as expected for all Dovecot control files, including indexes.

Does anyone know of any side effects the forced mtime update may have that I may not be seeing?

Thanks again for any assistance.

-Doug

Attachment: file-dotlock.c.diff


On May 17, 2007, at 10:45 AM, Doug Council wrote:

We are in the process of migrating away from Courier-IMAP/POP3 and Maildrop. I want to use Dovecot (LDA, IMAP, POP3). During my testing, it has worked great except for dotlocking on the dovecot-uidlist file.

The problem:

When a delivery is being made with deliver and a mail client has the mailbox open (Thunderbird in this case), neither Thunderbird nor deliver can get a dotlock on the dovecot-uidlist file, causing both deliver and Thunderbird to hang until the dotlock timeout runs out and the lock gets replaced. Once the lock is replaced, both will go about their business until the next lock miss and hang again. Eventually, everything is delivered and Thunderbird wakes up.

Looking at each of the processes with truss, they are looping trying to stat the dovecot-uidlist.lock file, which no longer exists.

We are using NFS, and based on reading through the mailing list archives, it can be a little difficult to get working reliably. However, I've read quite a few posts from people with the same or a similar configuration who have had good luck with the setup. To rule out issues with access from multiple boxes for now, I've been doing all testing with a single NFS client.

Our configuration:

NetApp filers for storage
FreeBSD 6.2-RELEASE NFS clients
Postfix 2.3.9 MTA
Dovecot 1.0.0 LDA for local deliveries
Dovecot 1.0.0 IMAP for pickup

My dovecot.conf file is at the end of this message. NFS access caching on the FreeBSD client has been disabled (vfs.nfs.access_cache_timeout = 0, see NFS mount options below). The Postfix destination recipient and concurrency limits for the Dovecot LDA are set to 1.

The NFS mount options:

rw,tcp,-r=32768,-w=32768,nfsv3,dumbtimer,noatime,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0

The dovecot.conf file:

protocols = imap imaps pop3 pop3s
disable_plaintext_auth = no
syslog_facility = local0
ssl_cert_file = /nethere/conf/dovecot/ssl-nh-cert.pem
ssl_key_file = /nethere/conf/dovecot/ssl-nh-key.pem
login_greeting = Server ready.
login_log_format_elements = user=<%u> ip=[%r] method=%m encryption=%c pid=%p
login_log_format = %U$: %s
mail_location = maildir:~/Maildir:INDEX=MEMORY
mmap_disable = yes
dotlock_use_excl = no
lock_method = dotlock
first_valid_uid = 200
last_valid_uid = 200
first_valid_gid = 200
last_valid_gid = 200
maildir_copy_with_hardlinks = yes

namespace private {
  prefix = INBOX.
  inbox = yes
}

protocol imap {
  login_executable = /usr/local/libexec/dovecot/imap-login
  mail_executable = /usr/local/libexec/dovecot/imap
  imap_client_workarounds = outlook-idle delay-newmail
}

protocol pop3 {
  login_executable = /usr/local/libexec/dovecot/pop3-login
  mail_executable = /usr/local/libexec/dovecot/pop3
  pop3_uidl_format = UID%u-%v
  pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
}

protocol lda {
  postmaster_address = [EMAIL PROTECTED]
  sendmail_path = /usr/sbin/sendmail
  auth_socket_path = /var/run/dovecot/auth-master
  syslog_facility = mail
}

auth_executable = /usr/local/libexec/dovecot/dovecot-auth

auth default {
  mechanisms = plain digest-md5 cram-md5
  passdb ldap {
    args = /nethere/conf/dovecot/dovecot-ldap.conf
  }
  userdb ldap {
    args = /nethere/conf/dovecot/dovecot-ldap.conf
  }
  user = root
  socket listen {
    master {
      path = /var/run/dovecot/auth-master
      mode = 0600
      user = mailuser
      group = mailuser
    }
  }
}

It may just be "how it works", but the lock contention seems a little too fragile for busy mailboxes.

Does anyone have any ideas?  Thanks in advance for any assistance.

-Doug
