Hi Denis,

On Jun 16 22:39, Denis Excoffier wrote:
> Hello,
>
> I've exercised 'getent' a little bit those days (with 'db_enum: all'
> in /etc/nsswitch.conf), and it seems to me that the timeout 'tv' (3
> seconds, in ldap.cc) is probably too small for servers not so quickly
> responsive or with many (500000, fake or real) users around (see the
> call to ldap_get_next_page_s()). 300 seconds should be enough i
> suppose.

300 seconds is a lot.  I'm not quite sure I'm following you here.  Let
me start by explaining how the timeout is applied so we're all on the
same page.

When opening the connection to the DC, the bind operation will wait up
to 3 seconds to complete.

In calls to getpwnam, getpwuid, getgrnam, getgrgid, the 3 second
timeout is the timeout for fetching a single user or group entry.  And
it's not the timeout for fetching the basic info (name<->SID mapping),
but only the timeout for the LDAP call returning the extended user info
(pgid, gecos, home, shell).  So, typically the user<->uid mapping is
correct, only the secondary info might be wrong, if it's set in AD at
all.

When enumerating accounts (getpwent, getgrent) the timeout is applied to
every call fetching the next 100 accounts.  The *only* information which
is actually enumerated is the list of existing SIDs, with a timeout of
3 seconds per 100 SIDs.

So it's taking more than 3 seconds to fetch 100 account SIDs?  And
then...

> Also it is a pity that LDAP_TIMEOUT is not announced to the user
> (except under strace: 0x55). I don't know the general policy for
> timeouts, but i consider that the user would like to be informed when
> the passwd/group list was truncated.

...you really get an LDAP_TIMEOUT from ldap_get_next_page_s?  This
puzzles me a bit, since the documentation implies that this won't
happen.  Here's the snippet from MSDN:

  When parsing the results set, it is possible for the server to return
  an empty page of results and yet still respond with an LDAP_SUCCESS
  return code. This indicates that the server was unable to retrieve a
  page of results, due to a timeout or other reason, but has not
  completed the search request. The proper behavior in this instance is
  to continue to call ldap_get_next_page_s until either another page of
  results are successfully retrieved, an error code is returned, or
  LDAP_NO_RESULTS_RETURNED is returned to indicate the search is
  complete.

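A rough sketch of the loop that MSDN paragraph describes, with
ldap_get_next_page_s() replaced by a hypothetical callback so it stands
on its own.  The names (PageRc, Page, EnumResult, enumerate_all) and the
retry cap are made up for illustration, and treating a timeout like an
empty page is an assumption, not what ldap.cc currently does:

```cpp
#include <cstddef>
#include <functional>

// Stand-ins for the result codes discussed above; these are not the
// real winldap.h constants.
enum class PageRc { Success, Timeout, NoResultsReturned, OtherError };

struct Page {
  PageRc rc;
  std::size_t entries;   // what ldap_count_entries() would report
};

struct EnumResult {
  std::size_t total;     // number of entries collected so far
  bool complete;         // false if enumeration stopped early
};

// fetch_page stands in for one ldap_get_next_page_s() call.
EnumResult
enumerate_all (const std::function<Page ()> &fetch_page, int max_retries)
{
  EnumResult res { 0, false };
  int retries = 0;

  for (;;)
    {
      Page p = fetch_page ();

      if (p.rc == PageRc::NoResultsReturned)
        {
          res.complete = true;      // search finished normally
          return res;
        }
      if (p.rc == PageRc::Success && p.entries > 0)
        {
          res.total += p.entries;   // a real page: consume it
          retries = 0;
          continue;
        }
      if (p.rc == PageRc::Success || p.rc == PageRc::Timeout)
        {
          // Empty page, or (assumption) a timeout: ask again, but
          // give up after max_retries consecutive failures.
          if (++retries > max_retries)
            return res;
          continue;
        }
      return res;                   // any other error: list is truncated
    }
}
```

A caller could check `complete` to learn that the list was truncated,
which is the piece of information that would have to be propagated
somehow.
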
So I expect an LDAP_SUCCESS with ldap_count_entries() == 0 and then
repeat the request.  But the code doesn't expect LDAP_TIMEOUT in this
case.  Do I have to handle LDAP_TIMEOUT here as well?

As far as propagating the timeout to the user goes, that's kind of
tricky.  I'm not looking forward to doing that, but if so, it could only
be an EIO error returned from get{pw,gr}ent.

The general problem with timeouts is that they are always wrong.

I'm wondering if the timeout, at least for enumerating accounts, should
go away entirely.  In case of a connection problem this could result in
a hang for about 2 minutes by default, I think (LDAP_OPT_PING_LIMIT).

I could also raise the timeout, but the value doesn't really matter, it
will be just as wrong as 3 seconds, just differently.

Thoughts?

> Another (unrelated and less important) problem is that 'getent'
> happily produces lines with some extra ':', in particular when the
> gecos field itself contains ':'.

Wow, that *is* important.  All fields returned from the server have to
get their colons converted to commas.  I'll fix that.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
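
For reference, the colon-to-comma conversion mentioned for fields such
as gecos could look roughly like this.  It is only a sketch:
sanitize_passwd_field is a hypothetical helper name, not the actual
Cygwin code, and it works on narrow strings for simplicity:

```cpp
#include <algorithm>
#include <string>

// Replace every ':' in a server-supplied field with ',' so the value
// cannot break the colon-separated passwd/group line format.
std::string
sanitize_passwd_field (std::string field)
{
  std::replace (field.begin (), field.end (), ':', ',');
  return field;
}
```

Applied to each field before the passwd or group line is assembled,
this keeps the number of ':' separators in the output constant.
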