Quoting Timo Sirainen <t...@iki.fi>:

With a quick test I can't reproduce pop3_lock_session=yes causing a
crash. I guess it needs something else besides what I tested. It would
be helpful if your Dovecot binaries weren't stripped of debug symbols. I
could then ask for some more information from the core dumps with gdb.


Hi Timo,

indeed it is a bug that I could not reproduce myself.
Having debug symbols and producing the stack trace is the next
logical step and I will work on this tomorrow.
Since --enable-debug does not work in your configure script, can you
direct me as to what is needed? Is there an option in configure or
do I need to mess with the makefiles?

On the other hand, I have found two different bugs.
With pop3_lock_session=yes we get the crash described here and also,
of course, delays in local deliveries whenever a client has an active POP
session. And I can tell you we have a lot of abusive clients that keep
hitting our POP servers continuously, or keep connections open for a VERY
long time.

To address that, we set pop3_lock_session=no. In this case, there is an fcntl
lock leak somewhere. The good news is that we have reproduced it and I will
send the relevant information in a separate mail.
I also read the following thread from a while back:

http://www.dovecot.org/list/dovecot/2009-February/037098.html

Regards,

Kostas

On Wed, 2011-08-10 at 13:07 +0300, Kostas Zorbadelos wrote:
On 07/22/2011 01:02 PM, Kostas Zorbadelos wrote:

Hello,

since I saw no action on this, here is an update on what we discovered today.

After setting pop3_lock_session = no the core dumps went away.
We will leave it like that and watch it for the next few days. If we set
pop3_lock_session = yes, the problem is reproduced.

If I can do anything else to help debug the problem, please let me know.

Regards,

Kostas
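
To check whether the crashes really stay away after the change, the
"killed with signal 11" lines that the master process logs (examples are
quoted below) can be counted per host. This is only a minimal sketch in
Python, assuming syslog lines in exactly the format shown in the quoted
report; the names CRASH_RE and count_crashes are made up for illustration
and are not part of Dovecot.

```python
import re
from collections import Counter

# Assumed line format (taken from the quoted report):
#   Jul 22 00:18:21 pop01 dovecot: master: Error: service(pop3): child 4078 killed with signal 11 (core dumps disabled)
CRASH_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+(?P<host>\S+)\s+dovecot: master: Error: "
    r"service\((?P<service>[^)]+)\): child \d+ killed with signal (?P<signal>\d+)"
)


def count_crashes(lines, signal=11):
    """Return a Counter mapping (host, service) to the number of matching crashes."""
    counts = Counter()
    for line in lines:
        m = CRASH_RE.match(line)
        # Only count children killed by the requested signal (11 = SIGSEGV on Linux).
        if m and int(m.group("signal")) == signal:
            counts[(m.group("host"), m.group("service"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Jul 22 00:18:21 pop01 dovecot: master: Error: service(pop3): "
        "child 4078 killed with signal 11 (core dumps disabled)",
        "Jul 22 00:19:31 pop03 dovecot: master: Error: service(pop3): "
        "child 18849 killed with signal 11 (core dumps disabled)",
    ]
    for (host, service), n in sorted(count_crashes(sample).items()):
        print(f"{host} {service}: {n}")
```

Feeding it the collected syslog of all backends before and after switching
pop3_lock_session should show whether the count per host drops to zero.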

> Greetings to all.
>
> It's my first post to the list. We just completed a migration from qpopper to dovecot
> for our IMAP and POP3 services. We have a rather large mail environment
> (we are the biggest provider in Greece).
>
> So, here are the details:
>
> - We keep getting errors like these in our production environment:
>
> Jul 22 00:18:21 pop01 dovecot: master: Error: service(pop3): child 4078 killed with signal 11 (core dumps disabled)
> Jul 22 00:19:31 pop03 dovecot: master: Error: service(pop3): child 18849 killed with signal 11 (core dumps disabled)
>
> ---------------------------------------------------------------------
> dovecot -n output
> ---------------------------------------------------------------------
> /opt/dovecot/sbin/dovecot -n
> # 2.0.13: /opt/dovecot/etc/dovecot/dovecot.conf
> # OS: Linux 2.6.18-92.1.22.el5 x86_64 CentOS release 5.5 (Final)
> auth_cache_negative_ttl = 10 mins
> auth_cache_size = 5 M
> auth_cache_ttl = 10 mins
> auth_verbose = yes
> default_client_limit = 5000
> default_process_limit = 500
> disable_plaintext_auth = no
> first_valid_uid = 200
> listen = *
> log_timestamp = "%Y-%m-%d %H:%M:%S "
> login_greeting =<COMPANY>  ready
> mail_access_groups = mail otemail disk root
> mail_fsync = always
> mail_location = mbox:INDEX=/var/index/dovecot/%2.16Hn/%2.254Hn/%u
> mail_nfs_storage = yes
> mbox_lock_timeout = 2 mins
> mbox_min_index_size = 200 k
> mbox_read_locks = dotlock_try fcntl
> mbox_write_locks = dotlock_try fcntl
> passdb {
>    args = /opt/dovecot/etc/dovecot/dovecot-ldap.conf.ext
>    driver = ldap
> }
> protocols = imap pop3
> service auth-worker {
>    user = dovenull
> }
> service imap-login {
>    inet_listener imap {
>      port = 143
>    }
>    inet_listener imaps {
>      port = 993
>      ssl = yes
>    }
> }
> service pop3-login {
>    inet_listener pop3 {
>      port = 110
>    }
>    inet_listener pop3s {
>      port = 995
>      ssl = yes
>    }
> }
> ssl = no
> userdb {
>    args = /opt/dovecot/etc/dovecot/dovecot-ldap.conf.ext
>    driver = ldap
> }
> verbose_proctitle = yes
> protocol imap {
>    imap_client_workarounds = delay-newmail tb-extra-mailbox-sep
>    mail_max_userip_connections = 100
> }
> protocol pop3 {
>    mail_max_userip_connections = 100
>    pop3_client_workarounds = outlook-no-nuls oe-ns-eoh
>    pop3_fast_size_lookups = yes
>    pop3_lock_session = yes
>    pop3_reuse_xuidl = yes
>    pop3_uidl_format = %08Xu%08Xv
> }
>
> I enabled core dumps in one of our backend servers and here is the relevant gdb trace:
>
> [root@pop08 ~]# gdb /opt/dovecot/libexec/dovecot/pop3 <path_to_core_file>/core.9273
> GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-32.el5_6.2)
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /opt/dovecot/libexec/dovecot/pop3...(no debugging symbols found)...done.
> Reading symbols from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0...(no debugging symbols found)...done.
> Loaded symbols for /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> Reading symbols from /opt/dovecot/lib/dovecot/libdovecot.so.0...(no debugging symbols found)...done.
> Loaded symbols for /opt/dovecot/lib/dovecot/libdovecot.so.0
> Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
> Loaded symbols for /lib64/librt.so.1
> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libpthread.so.0
> Core was generated by `dovecot/pop3'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x00002b52e1027e54 in istream_raw_mbox_get_start_offset () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> (gdb) bt full
> #0 0x00002b52e1027e54 in istream_raw_mbox_get_start_offset () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> No symbol table info available.
> #1 0x00002b52e102b759 in ?? () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> No symbol table info available.
> #2 0x00002b52e100a2c0 in index_mail_expunge () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> No symbol table info available.
> #3  0x0000000000405e9c in client_update_mails ()
> No symbol table info available.
> #4  0x00000000004061c1 in client_command_execute ()
> No symbol table info available.
> #5  0x00000000004045b9 in client_handle_input ()
> No symbol table info available.
> #6 0x00002b52e12df698 in io_loop_call_io () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> No symbol table info available.
> #7 0x00002b52e12e09d5 in io_loop_handler_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> No symbol table info available.
> #8 0x00002b52e12df62d in io_loop_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> No symbol table info available.
> #9 0x00002b52e12cdf13 in master_service_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> No symbol table info available.
> #10 0x0000000000403994 in main ()
> No symbol table info available.
> (gdb)
>
> All traces of the crashes are identical, that is
> #0 0x00002b52e1027e54 in istream_raw_mbox_get_start_offset () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> #1 0x00002b52e102b759 in ?? () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> #2 0x00002b52e100a2c0 in index_mail_expunge () from /opt/dovecot/lib/dovecot/libdovecot-storage.so.0
> #3  0x0000000000405e9c in client_update_mails ()
> #4  0x00000000004061c1 in client_command_execute ()
> #5  0x00000000004045b9 in client_handle_input ()
> #6 0x00002b52e12df698 in io_loop_call_io () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> #7 0x00002b52e12e09d5 in io_loop_handler_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> #8 0x00002b52e12df62d in io_loop_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> #9 0x00002b52e12cdf13 in master_service_run () from /opt/dovecot/lib/dovecot/libdovecot.so.0
> #10 0x0000000000403994 in main ()
>
> We have mboxes over NFS and we also have an LDAP user backend. For now, I do not have a scenario
> that reproduces the problem. Any ideas or input are highly appreciated. Of course I can provide
> any information requested (without exposing restricted company or client data) to help trace
> the problem and lead to a solution.
>
> Thanks and keep up the good work!
>
> Regards,
>
> Kostas Zorbadelos
>
>
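
The statement in the quoted report that all crash traces are identical can
be checked mechanically by comparing only the function names of each gdb
backtrace, since the addresses of shared-library frames may differ between
processes. A rough sketch in Python, assuming frame lines of the form
"#N 0xADDR in NAME () ..." as in the trace above; the helper names
frame_names and same_crash are hypothetical.

```python
import re

# Assumed frame format, as in the quoted "bt" output:
#   #2 0x00002b52e100a2c0 in index_mail_expunge () from /opt/.../libdovecot-storage.so.0
# An optional leading "> " quote marker is tolerated so mail text can be pasted directly.
FRAME_RE = re.compile(r"^(?:>\s*)?#(\d+)\s+0x[0-9a-fA-F]+ in (\S+) \(")


def frame_names(bt_text):
    """Return the function names of a gdb backtrace, innermost frame first."""
    names = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            # Frames without symbols show up as "??" and are kept as-is.
            names.append(m.group(2))
    return names


def same_crash(bt_a, bt_b):
    """True if both backtraces go through the same sequence of functions."""
    return frame_names(bt_a) == frame_names(bt_b)
```

Two core dumps whose traces both run from main through client_update_mails
and index_mail_expunge into istream_raw_mbox_get_start_offset would compare
equal here even if the library was mapped at different addresses.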






