I just double checked and we're seeing the same thing here, although we
were never running with "noac" to begin with. Our poor NFS server would
melt probably. ;)
This is immediately before the crash:
Dec 22 13:09:20 gehenna14.rutgers.edu dovecot: IMAP(user):
fdatasync(/rci/nqu/rci/u5/bruben/dovecot/.INBOX/dovecot-uidlist) failed:
Stale NFS file handle
Dec 22 13:09:20 gehenna14.rutgers.edu dovecot: IMAP(user):
/rci/nqu/rci/u5/bruben/dovecot/.INBOX/dovecot-uidlist: next_uid was
lowered (1507 -> 1506, hdr=1506)
Dec 22 13:12:06 gehenna14.rutgers.edu dovecot: IMAP(user):
/rci/nqu/rci/u5/bruben/dovecot/.INBOX/dovecot-uidlist: Duplicate file
entry at line 3:
1249348137.M369064P26267V04240006I01ED7E09_0.gehenna5.rutgers.edu,S=15857:2,S
(uid 1208 -> 1508)
On 12/22/2009 04:33 PM, Brandon Davidson wrote:
We've started seeing the maildir_uidlist_records_array_delete assert crash as
well. It always seems to be preceded by a 'stale NFS file handle' error from a
the same user on a different connection.
Dec 22 10:12:20 oh-popmap5p dovecot: imap: user=<apbao>, rip=a.a.a.a, pid=2439:
fdatasync(/home11/apbao/Maildir/dovecot-uidlist) failed: Stale NFS file handle
Dec 22 10:12:20 oh-popmap5p dovecot: imap: user=<apbao>, rip=a.a.a.a, pid=2439:
/home11/apbao/Maildir/dovecot-uidlist: next_uid was lowered (2642 -> 2641, hdr=2641)
Dec 22 11:17:26 cc-popmap2p dovecot: imap: user=<apbao>, rip=b.b.b.b,
pid=28088: Panic: file maildir-uidlist.c: line 403
(maildir_uidlist_records_array_delete): assertion failed: (pos != NULL)
Dec 22 11:17:26 cc-popmap2p dovecot: imap: user=<apbao>, rip=b.b.b.b, pid=28088: Raw backtrace: imap [0x4d8986] -> imap [0x4d97b0] ->
imap(i_fatal+0) [0x4d8c7a] -> imap [0x44f2cc] -> imap [0x44f814] -> imap [0x4500a2] -> imap(maildir_uidlist_refresh+0x9d) [0x450686]
-> imap [0x44bff1] -> imap [0x44c0a8] -> imap [0x44c178] -> imap(maildir_storage_sync_init+0x7c) [0x44c6e6] ->
imap(mailbox_sync_init+0x44) [0x489922] -> imap(imap_sync_init+0xab) [0x42e02b] -> imap [0x42f107] -> imap(cmd_sync_delayed+0x1c6)
[0x42f663] -> imap(client_handle_input+0x119) [0x4244d4] -> imap(client_input+0xb4) [0x424594] -> imap(io_loop_handler_run+0x17d)
[0x4e5020] -> imap(io_loop_run+0x3b) [0x4e4214] -> imap(main+0xa6) [0x4300af] -> /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c4ea1d994]
-> imap [0x419aa9]
Dec 22 11:17:26 cc-popmap2p dovecot: dovecot: child 28088 (imap) killed with
signal 6 (core dumped)
Dec 22 13:16:49 cc-popmap3p dovecot: imap: user=<ndunn>, rip=x.x.x.x, pid=3908:
fdatasync(/home2/ndunn/Maildir/dovecot-uidlist) failed: Stale NFS file handle
Dec 22 13:25:16 cc-popmap3p dovecot: imap: user=<ndunn>, rip=y.y.y.y, pid=3228:
Panic: file maildir-uidlist.c: line 403 (maildir_uidlist_records_array_delete):
assertion failed: (pos != NULL)
Dec 22 13:25:16 cc-popmap3p dovecot: imap: user=<ndunn>, rip=y.y.y.y, pid=3228: Raw backtrace: imap [0x4d8986] -> imap [0x4d97b0] ->
imap(i_fatal+0) [0x4d8c7a] -> imap [0x44f2cc] -> imap [0x44f814] -> imap [0x4500a2] -> imap(maildir_uidlist_refresh+0x9d) [0x450686]
-> imap [0x44bff1] -> imap [0x44c0a8] -> imap [0x44c178] -> imap(maildir_storage_sync_init+0x7c) [0x44c6e6] ->
imap(mailbox_sync_init+0x44) [0x489922] -> imap(imap_sync_init+0xab) [0x42e02b] -> imap [0x42f107] -> imap(cmd_sync_delayed+0x1c6)
[0x42f663] -> imap(client_handle_input+0x119) [0x4244d4] -> imap(client_input+0xb4) [0x424594] -> imap(io_loop_handler_run+0x17d)
[0x4e5020] -> imap(io_loop_run+0x3b) [0x4e4214] -> imap(main+0xa6) [0x4300af] -> /lib64/libc.so.6(__libc_start_main+0xf4) [0x3e5021d994]
-> imap [0x419aa9]
Dec 22 13:25:16 cc-popmap3p dovecot: dovecot: child 3228 (imap) killed with
signal 6 (core dumped)
I will note that we did not start seeing this crash until we took 'noac' out of
our NFS mount options, as discussed on this list late last week. On the other
hand, load on our NFS server (as measured in IOPS/sec) has dropped by a factor
of 10.
-Brad
-----Original Message-----
From: dovecot-bounces+brandond=uoregon....@dovecot.org [mailto:dovecot-
bounces+brandond=uoregon....@dovecot.org] On Behalf Of David Halik
Sent: Tuesday, December 22, 2009 7:48 AM
To: dovecot@dovecot.org
Subject: Re: [Dovecot] dovecot-1.2.8 imap crash (with backtrace)
I'm seeing both of these dumps on multiple users now with 1.2.9, so I
went ahead and did backtraces for them both.
maildir_uidlist_records_array_delete panic: http://pastebin.com/f20614d8
ns_get_listed_prefix panic: http://pastebin.com/f1420194c
On 12/21/2009 12:43 PM, David Halik wrote:
Just wanted to update you that I just upgraded all of our servers to
1.2.9 and I'm still seeing the array_delete panic:
Dec 21 12:10:16 gehenna11.rutgers.edu dovecot: IMAP(user1): Panic:
file maildir-uidlist.c: line 403
(maildir_uidlist_records_array_delete): assertion failed: (pos != NULL)
Dec 21 12:15:12 gehenna19.rutgers.edu dovecot: IMAP(user2): Panic:
file maildir-uidlist.c: line 403
(maildir_uidlist_records_array_delete): assertion failed: (pos != NULL)
I also started receiving a good deal of these, but only from one user
so far:
Dec 21 12:16:42 gehenna14.rutgers.edu dovecot: IMAP(user3): Panic:
file cmd-list.c: line 242 (ns_get_listed_prefix): assertion failed:
(match == IMAP_MATCH_YES)
Dec 21 12:18:20 gehenna14.rutgers.edu dovecot: IMAP(user3): Panic:
file cmd-list.c: line 242 (ns_get_listed_prefix): assertion failed:
(match == IMAP_MATCH_YES)
Dec 21 12:18:20 gehenna14.rutgers.edu dovecot: IMAP(user3): Panic:
file cmd-list.c: line 242 (ns_get_listed_prefix): assertion failed:
(match == IMAP_MATCH_YES)
Dec 21 12:19:57 gehenna14.rutgers.edu dovecot: IMAP(user3): Panic:
file cmd-list.c: line 242 (ns_get_listed_prefix): assertion failed:
(match == IMAP_MATCH_YES)
Let me know if you need full backtraces from the core dump.
On 12/17/2009 02:06 PM, David Halik wrote:
On 12/17/2009 01:07 PM, Timo Sirainen wrote:
On Thu, 2009-12-17 at 12:49 -0500, David Halik wrote:
I applied those patches to my 1.2.8 installation before 1.2.9 was
released and that immediately fixed the expunge crash, but the
array_delete bug is still there.
Do you also see the "duplicate file entry" before the crash?
Yes, the duplicate file entry is always reported immediately before
the crash, just as Ralf reported too. You can see it in this example
pastebin I took from one of our users:
http://pastebin.com/f29c55de5
maildir:~/Maildir:INDEX=/rci/nqu%h/dovecot:CONTROL=/rci/nqu%h/dovecot
Are the index/control files on NFS? Are there multiple different
servers
accessing mail data?
Correct. All index, control files, amd mail storage are located on
NFS and there are multiple load balanced servers accessing the NFS
data. We're currently running with:
mmap_disable = yes
dotlock_use_excl = yes
fsync_disable = no
mail_nfs_storage = yes
mail_nfs_index = yes
--
================================
David Halik
System Administrator
OIT-CSS Rutgers University
dha...@jla.rutgers.edu
================================
--
================================
David Halik
System Administrator
OIT-CSS Rutgers University
dha...@jla.rutgers.edu
================================