On 11/30/10 09:33, John Baldwin wrote:
On Monday, November 29, 2010 8:06:54 pm Adam McDougall wrote:
I've been running dovecot 1.1 on FreeBSD 7.x for a while with a bare
minimum of NFS problems, but it got worse with 8.x. I have 2-4 servers
(usually just 2) accessing mail on a Netapp over NFSv3 via imapd.
Delivery is via procmail, which doesn't touch the dovecot metadata, and
webmail uses imapd. Client connections to imapd go to random servers
and I don't yet have solid means to keep certain users on certain
servers. I upgraded some of the servers to 8.x and dovecot 1.2 and ran
into stale NFS file handles causing index/uidlist corruption, which made
inboxes appear empty when they were not. In some cases the corrupt
index had to be deleted manually. I first suspected dovecot 1.2 since it
was upgraded at the same time, but I downgraded to 1.1 and it's doing
the same thing. I don't really have a wealth of details to go on
yet and I usually stay quiet until I do, and half the time it is
difficult to reproduce myself so I've had to put it in production to get
a feel for progress. This only happens a dozen or so times per weekday
but I feel the need to start taking bigger steps. I'll probably do what
I can to get IMAP back on a stable base (7.x?) and also try to debug 8.x
on the remaining servers. A binary search is possible if I
can reproduce the symptoms often enough even if I have to put a test
server in production for a few hours.
There were some changes to allow more concurrency in the NFS client in 8 (and
7.2+) that caused ESTALE errors to occur on open(2) more frequently. You can
try setting 'vfs.lookup_shared=0' to disable the extra concurrency (but at a
performance cost) as a workaround. The most recent 7.x and 8.x have some
changes to open(2) to minimize ESTALE errors that I think get it back to the
same level as when lookup_shared is set to 0.
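
To illustrate the general idea of retrying an open that fails with ESTALE,
here is a minimal user-space sketch in Python. It is not the kernel change
described above and not anything dovecot does; the helper name
open_retry_estale and its retries, delay and opener parameters are
hypothetical, chosen only for this example.

```python
import errno
import os
import time


def open_retry_estale(path, flags=os.O_RDONLY, retries=3, delay=0.1,
                      opener=os.open):
    """Open path and return a file descriptor.

    If the open fails with ESTALE, retry up to `retries` more times,
    sleeping `delay` seconds between attempts. Any other error, or an
    ESTALE after the last retry, is re-raised to the caller.
    `opener` is a hypothetical hook so the retry logic can be exercised
    without a real NFS mount.
    """
    attempt = 0
    while True:
        try:
            return opener(path, flags)
        except OSError as exc:
            if exc.errno != errno.ESTALE or attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay)
```

A caller would use it like os.open and close the descriptor with os.close
when done. Whether a retry actually succeeds depends on the client
re-looking up the file handle, so treat this as a sketch of the retry
pattern rather than a fix.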
I tried vfs.lookup_shared=0 on two of the three already with no help
(forgot what it was called or I would have mentioned it), and I also
tried vfs.nfs.prime_access_cache=1 on a guess on all three but that
didn't help either. I'll go through the other suggestions and see where
it gets me. Thanks all for the input.