On Wed, 27 Jun 2007 23:15:32 +0300 Timo Sirainen <[EMAIL PROTECTED]>
wrote:

> On Thu, 2007-06-21 at 16:49 +0900, Christian Balzer wrote:
> > > You could try
> > > http://dovecot.org/patches/debug/mempool-accounting.diff and send
> > > USR1 signal to dovecot-auth after a while. It logs how much memory
> > > is used by all existing memory pools. Each auth request has its own
> > > pool, so if it's really leaking them it's probably logging a lot of
> > > lines. If not, then the leak is elsewhere.
> > >
> > I grabbed the Debian package source on a test machine (not gonna chance
> > anything on the production servers), applied the patch, did add
> > --enable-debug to the debian/rules file (and got the #define DEBUG
> > in config.h), created the binary packages, installed, configured,
> > started them, tested a few logins and... nothing gets logged
> > in mail.* if I send a USR1 to dovecot-auth. Anything I'm missing?
>
> Bug, fixed: http://hg.dovecot.org/dovecot-1.0/rev/a098e94cd318
>
Thanks, that fixed the silence of the auth-sheep.

This is the output after start-up:
---
Jul 2 13:59:54 engtest03 dovecot: auth(default): pool auth request handler: 104 / 4080 bytes
Jul 2 13:59:54 engtest03 last message repeated 19 times
Jul 2 13:59:54 engtest03 dovecot: auth(default): pool passwd_file: 56 / 10224 bytes
Jul 2 13:59:54 engtest03 dovecot: auth(default): pool Environment: 224 / 2032 bytes
Jul 2 13:59:54 engtest03 dovecot: auth(default): pool ldap_connection: 576 / 1008 bytes
Jul 2 13:59:54 engtest03 dovecot: auth(default): pool auth: 1520 / 2032 bytes
---
Used memory of dovecot-auth after 1 login was 3148KB (RSS).

This is after a good thrashing with rabid (from the postal package),
with just 2 users though, using POP3 logins:
---
Jul 2 14:12:30 engtest03 dovecot: auth(default): pool auth request handler: 104 / 4080 bytes
Jul 2 14:12:30 engtest03 last message repeated 128 times
Jul 2 14:12:30 engtest03 dovecot: auth(default): pool passwd_file: 56 / 10224 bytes
Jul 2 14:12:30 engtest03 dovecot: auth(default): pool Environment: 224 / 2032 bytes
Jul 2 14:12:30 engtest03 dovecot: auth(default): pool ldap_connection: 576 / 1008 bytes
Jul 2 14:12:30 engtest03 dovecot: auth(default): pool auth: 1520 / 2032 bytes
---
Note that the number of auth request handler pools has grown to 128.
After another short round of rabid the handler pools grew to 137 and
the size of dovecot-auth to 5100KB. The number of handler pools never
fell, nor did the memory footprint, obviously. :-p

At about 800k logins/day/node here it's obvious now why dovecot-auth
explodes after less than a week with a max size of 512MB.

> > But no matter, it is clearly leaking just as bad as 0.99 and I venture
> > that this is the largest installation with LDAP as authentication
> > backend. I wonder if this leak would be avoided by having LDAP lookups
> > performed by worker processes as with SQL.
>
> Then you'd only have multiple leaking worker processes.
>
Yes, I realize that.
But I presume since these are designed to die off and be recreated on
the fly the repercussions would be much better. ;)
Of course now it looks like this is not LDAP related after all.

> > > The same as 0.99. You could also kill -HUP dovecot when dovecot-auth
> > > is nearing the limit. That makes it a bit nicer, although not
> > > perfectly safe either (should fix this some day..).
> > >
> > If that leak can't be found I would very much appreciate a solution
> > that at least avoids failed and/or delayed logins.
>
> That would require that login processes don't fail logins if connection
> to dovecot-auth drops, but instead wait until they can connect back to
> it and try again. And maybe another alternative would be to just
> disconnect the client instead of giving login failure.
>
Anything that fixes this one way or the other would be nice. ^_^

Oh, and HUP'ing the master is not an option here. I guess the system
load triggers a race condition in dovecot, because several times when
doing so I got this:
---
Jun 22 15:08:58 mb11 dovecot: listen(143) failed: Interrupted system call
---
Which results in a killed off dovecot, including all active sessions.

The self-terminating dovecot-auth is not nice, but at least more
predictable and does recover by itself:
---
Jun 30 19:03:27 mb12 dovecot: auth(default): pool_system_malloc(): Out of memory
Jun 30 19:03:27 mb12 dovecot: child 11110 (auth) returned error 83 (Out of memory)
Jun 30 19:03:28 mb12 dovecot: pop3-login: Can't connect to auth server at default: Resource temporarily unavailable
Jun 30 19:03:28 mb12 last message repeated 11 times
---
Of course the 12 users that tried to log in at this time are probably
not amused or at least confused.

Regards,

Christian
--
Christian Balzer        Network/Systems Engineer                NOC
[EMAIL PROTECTED]       Global OnLine Japan/Fusion Network Services
http://www.gol.com/