Hi,

We're now running with ldap via haproxy, as was suggested in this thread by Timo. So far, so good: it seems to work very well.

MJ

On 03/10/2016 04:15 PM, Gordon Grubert wrote:
Hi Timo,

On 01.03.2016 22:51, Timo Sirainen wrote:
On 29 Feb 2016, at 17:18, Gordon Grubert
<gordon.grubert+li...@uni-greifswald.de> wrote:

Hi,

we are using a round robin dns record for connections to our ldap
system. This works fine for almost all cases. In particular, for
dovecot this means that when an ldap server is stopped, dovecot
instantly reconnects to another ldap server.

But when the network connection to the active ldap server is broken,
dovecot sticks to the failed ldap server. Is there any possibility to
define a connection timeout?

What should happen is that as long as new requests keep coming,
Dovecot realizes after about 60 seconds that the LDAP server is
hanging. It then reconnects and the reconnection should work. But...
First of all, 60 seconds is likely much too long a timeout.

But more importantly, it looks like there's something weird going on
with the OpenLDAP library now. I added this somewhat recently and
tested that it works:

https://github.com/dovecot/core/commit/fb3178a1924dae52151d88c4d4ded879df43dd3f


But now that I'm testing it, the timeout doesn't seem to be
triggering. I don't know why it suddenly stopped working. This also
means that OpenLDAP seems to be internally stuck trying to connect to
a server that isn't responding. Dovecot doesn't currently decide which
LDAP server to connect to. It just passes all the hosts through to the
OpenLDAP library and lets it handle them. And it seems the OpenLDAP
library can't do this failover right now. So maybe Dovecot should be
responsible for that as well.

Anyway, for now you could set up haproxy on localhost and configure
Dovecot LDAP to connect to haproxy, with haproxy connecting to the
actual LDAP servers.
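
To illustrate the failover idea discussed above (try each LDAP host in
turn, giving up on each after a short connect timeout), here is a
minimal Python sketch. It is not haproxy's algorithm and not Dovecot or
OpenLDAP code; the function name first_reachable and the 5 second
default are assumptions made up for this example.

```python
import socket


def first_reachable(backends, connect_timeout=5.0):
    """Return a connected socket to the first backend that accepts a
    TCP connection within connect_timeout seconds.

    backends is a list of (host, port) pairs, tried in order.
    Raises OSError if none of them can be reached.
    """
    last_error = None
    for host, port in backends:
        try:
            # create_connection gives up after connect_timeout seconds,
            # so an unreachable host cannot block the caller forever.
            return socket.create_connection((host, port),
                                            timeout=connect_timeout)
        except OSError as exc:
            # Refused, unreachable or timed out: remember why and
            # move on to the next backend.
            last_error = exc
    raise OSError(f"no backend reachable: {last_error}")
```

A caller would pass the list of LDAP servers, for example
first_reachable([("ldap1.example.org", 389), ("ldap2.example.org", 389)]),
where both host names are placeholders.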


Today I upgraded to 2.2.21-1~auto+171 on Debian 8 and ran a lot of
"interruption tests". Your fix did not really solve the problem.

But I found another interesting fact: The openldap client on debian 8
can handle hard communication interrupts correctly. I've added

NETWORK_TIMEOUT 5
TIMEOUT         5

to ldap.conf because man 5 ldap.conf says:

NETWORK_TIMEOUT <integer>
    Specifies the timeout (in seconds) after which the poll(2)/select(2)
    following a connect(2) returns in case of no activity.

TIMEOUT <integer>
    Specifies a timeout (in seconds) after which calls to
    synchronous LDAP APIs will abort if no response is received. Also
    used for any ldap_result(3) calls where a NULL timeout parameter is
    supplied.
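
The difference between the two settings (one bounds the connect, the
other bounds the wait for a reply) can be sketched with plain TCP
sockets in Python. This is only an analogy under stated assumptions: it
does not use libldap, and the function name query_with_timeouts and its
parameter names are hypothetical, chosen to mirror the two ldap.conf
options.

```python
import socket


def query_with_timeouts(addr, request, network_timeout=5.0, timeout=5.0):
    """Send request to addr and return the first chunk of the reply.

    network_timeout bounds the TCP connect (compare NETWORK_TIMEOUT);
    timeout bounds the wait for a response (compare TIMEOUT).
    Raises socket.timeout if the peer stays silent for too long.
    """
    # The connect phase fails after network_timeout seconds.
    with socket.create_connection(addr, timeout=network_timeout) as sock:
        # From here on, each blocking call is bounded by timeout instead.
        sock.settimeout(timeout)
        sock.sendall(request)
        # recv raises socket.timeout if nothing arrives in time.
        return sock.recv(4096)
```

A caller that catches socket.timeout can then drop the connection and
reconnect, possibly to another server.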

We are using the ISC DHCP server with dynamic ldap connections. This
daemon uses - like dovecot - the LDAP API of the openldap client for
access to the ldap server. The DHCP server opens a persistent ldap
connection to handle all dhcp requests (the same behavior as dovecot).
Here, the timeouts for connection loss are working.

Therefore, my question: Why does this not work for dovecot, too, when
dovecot uses the same API? When Dovecot gets no response from the
LDAP server, it only has to reconnect.

IMAP server world domination requires a reconnect in case of connection
timeouts ;-)

Best regards,
Gordon
